Multi-Bank On-Chip Memory Management Techniques for CNN Accelerators

Cited by: 6

Authors:

Kang, Duseok [1]

Kang, Donghyun [1]

Ha, Soonhoi [1]

Affiliation:

[1] Seoul Natl Univ, Dept Comp Engn, Seoul 08826, South Korea

Source:

IEEE TRANSACTIONS ON COMPUTERS | 2022 / Vol. 71 / Issue 05

Keywords:

System-on-chip; Random access memory; Convolution; Memory management; Delays; Frequency modulation; Prefetching; Convolutional neural network; multi-bank memory management; layer fusion; prefetching; data reuse; accelerator;

DOI:

10.1109/TC.2021.3076987

Chinese Library Classification number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Since off-chip DRAM access significantly affects both performance and power consumption, convolutional neural network (CNN) accelerators commonly aim to maximize data reuse in on-chip memory. By organizing the on-chip memory into multiple banks, we may hide off-chip DRAM access delay by prefetching data to unused banks during computation. When and where to prefetch data and how to reuse the feature map data between layers define the multi-bank on-chip memory management (MOMM) problem. In this paper, we propose compiler techniques to solve the MOMM problem with two different objectives: one is to minimize the off-chip memory access volume, and the other is to minimize the processing delay caused by unhidden DRAM accesses. By running CNN benchmarks on a cycle-level NPU simulator, we demonstrate the trade-off between the two objectives. Compared with the baseline approach that does not reuse the feature map between layers, we reduce the DRAM access volume and the processing delay by up to 55.0 and 79.4 percent, respectively. Moreover, we extend the proposed techniques to consider layer fusion, which aims to reuse feature maps between layers. Experimental results confirm the superiority of the proposed hybrid fusion technique over the per-layer processing technique and the pure fusion technique.
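The idea of hiding DRAM delay by prefetching into an unused bank can be illustrated with a toy timing model. The sketch below is not the paper's compiler technique; it assumes a hypothetical two-bank ping-pong scheme in which the load for step i+1 overlaps the computation of step i, and the function names and cycle counts are made up for illustration.

```python
def total_delay_serial(load, compute):
    """No prefetching: every DRAM load stalls the compute units."""
    return sum(load) + sum(compute)


def total_delay_prefetch(load, compute):
    """Two-bank ping-pong (assumed scheme): while step i computes out of one
    bank, the data for step i+1 is loaded into the other bank."""
    if len(load) != len(compute):
        raise ValueError("load and compute must have the same length")
    if not load:
        return 0
    total = load[0]  # the very first load cannot be hidden
    for i, c in enumerate(compute):
        next_load = load[i + 1] if i + 1 < len(load) else 0
        total += max(c, next_load)  # the longer of the two dominates
    return total


def unhidden_delay(load, compute):
    """Delay added on top of pure computation by DRAM accesses
    that prefetching could not hide."""
    return total_delay_prefetch(load, compute) - sum(compute)


# Illustrative cycle counts (made up), one entry per processing step.
load = [4, 2, 6]
compute = [5, 3, 4]
print(total_delay_serial(load, compute))    # 24
print(total_delay_prefetch(load, compute))  # 19
print(unhidden_delay(load, compute))        # 7
```

In this example the remaining 7 cycles of unhidden delay come from the initial load (4 cycles) and from the third load exceeding the second step's computation by 3 cycles. A real scheduler would also have to decide which bank receives each prefetch and which feature maps stay on chip, which this sketch ignores.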

Pages: 1181-1193

Page count: 13

