Loop restructuring for data I/O minimization on limited on-chip memory embedded processors

Cited by: 2
Authors
Tembe, W
Pande, S
Affiliations
[1] Univ Cincinnati, Dept Elect & Comp Engn & Comp Sci, Cincinnati, OH 45221 USA
[2] Georgia Inst Technol, Coll Comp, Atlanta, GA 30332 USA
Funding
National Science Foundation (USA)
Keywords
loop fusion; limited memory; embedded processors; data locality; program dependence graph;
DOI
10.1109/TC.2002.1039852
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we propose a framework for analyzing the flow of values and their reuse in loop nests to minimize data traffic under the constraints of limited on-chip memory capacity and dependences. Our analysis first undertakes fusion of possible loop nests intra-procedurally and then performs loop distribution. The analysis discovers the closeness factor of two statements, which is a quantitative measure of the data traffic saved per unit of memory occupied if the statements were under the same loop nest, compared with the case where they are under different loop nests. We then develop a greedy algorithm that traverses the program dependence graph (PDG) to group statements together under the same loop nest legally, to promote maximal reuse per unit of memory occupied. We implemented our framework in Petit [4], a tool for dependence analysis and loop transformations. We compared our method with one based on tiling of the fused loop nest and one based on a greedy strategy that purely maximizes reuse. We show that our method works better than both of these strategies in most cases for processors such as the TMS320Cxx, which have a very limited amount of on-chip memory. The improvements in data I/O range from 10 to 30 percent over tiling and from 10 to 40 percent over maximal reuse for JPEG loops.
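The grouping idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names (`closeness`, `greedy_group`), the additive memory model, the `legal` callback standing in for the PDG dependence check, and the numbers in the usage note are all assumptions made for illustration. The only element taken from the abstract is the ratio itself, data traffic saved per unit of on-chip memory occupied, and the greedy choice of the highest ratio first.

```python
def closeness(saved, memory):
    """Traffic saved per unit of on-chip memory occupied (assumed form)."""
    return saved / memory if memory > 0 else 0.0


def greedy_group(statements, reuse, capacity, legal=lambda a, b: True):
    """Greedily place statements under shared loop nests.

    reuse maps a statement pair (a, b) to (saved_traffic, memory_needed).
    legal is a hypothetical stand-in for a dependence-legality test.
    """
    # Every statement starts in its own loop nest, using no shared memory.
    group = {s: frozenset([s]) for s in statements}
    used = {group[s]: 0 for s in statements}

    # Consider the pairs with the highest closeness factor first.
    ranked = sorted(reuse.items(), key=lambda kv: closeness(*kv[1]), reverse=True)
    for (a, b), (saved, memory) in ranked:
        ga, gb = group[a], group[b]
        if ga == gb or not legal(a, b):
            continue
        # Assumed additive model: merged nest needs both footprints plus the new one.
        need = used[ga] + used[gb] + memory
        if need > capacity:
            continue
        merged = ga | gb
        del used[ga]
        del used[gb]
        used[merged] = need
        for s in merged:
            group[s] = merged
    return sorted({tuple(sorted(g)) for g in group.values()})
```

With three hypothetical statements and `reuse = {("S1", "S2"): (8, 2), ("S2", "S3"): (6, 3), ("S1", "S3"): (2, 4)}`, a capacity of 4 units fuses only S1 and S2, while a capacity of 5 units allows all three statements to share one loop nest. This shows how the memory limit, rather than reuse alone, decides the grouping.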
Pages: 1269-1280
Page count: 12