Energy Efficient Hardware Loop Based Optimization for CGRAs

Cited by: 1
Authors
Sunny, Chilankamol [1]
Das, Satyajit [1]
Martin, Kevin J. M. [2]
Coussy, Philippe [2]
Affiliations
[1] IIT Palakkad, Palakkad, Kerala, India
[2] Univ Bretagne Sud, UMR 6285, Lab STICC, F-56100 Lorient, France
Source
JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY | 2022, Vol. 94, No. 9
Keywords
Coarse grained reconfigurable array (CGRA); Loop optimization; Hardware loop; Loop unrolling; POWER;
DOI
10.1007/s11265-022-01760-9
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Research interest and industry investment in edge computing solutions have increased dramatically in recent years. The consequent quest for balanced performance, energy efficiency, and flexibility has made Coarse Grained Reconfigurable Array (CGRA) architectures increasingly popular. To further improve performance and energy efficiency, several hardware- and software-based loop optimizations have been adopted for CGRAs. In this paper, we propose a centralized hardware-based loop optimization technique that achieves better area and energy results than the previously implemented distributed version. Without incurring any performance degradation, the area overhead relative to the reference architecture is reduced to 1.5% for a 4x2 CGRA configuration. Compared to the baseline model employing a software loop, the centralized hardware loop attains a maximum reduction of 47.3% and an arithmetic mean reduction of 27.2% in energy consumption. Furthermore, the paper explores the co-existence of CGRA-specific hardware and software optimizations and their impact on loop efficiency. Enhanced results are obtained by coupling loop unrolling with centralized hardware loop support: the combination achieves up to a 68.7% reduction in energy consumption and a 5.46x speed-up against the baseline model with no optimizations applied.
Pages: 895-912
Number of pages: 18