Multi-Bank On-Chip Memory Management Techniques for CNN Accelerators

Cited by: 6

Authors:

Kang, Duseok [1]

Kang, Donghyun [1]

Ha, Soonhoi [1]

Affiliations:

[1] Seoul Natl Univ, Dept Comp Engn, Seoul 08826, South Korea

Source:

IEEE TRANSACTIONS ON COMPUTERS | 2022, Vol. 71, No. 5

Keywords:

System-on-chip; Random access memory; Convolution; Memory management; Delays; Frequency modulation; Prefetching; Convolutional neural network; multi-bank memory management; layer fusion; prefetching; data reuse; accelerator;

DOI:

10.1109/TC.2021.3076987

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Subject classification code:

0812;

Abstract:

Since off-chip DRAM access significantly affects both performance and power consumption, convolutional neural network (CNN) accelerators commonly aim to maximize data reuse in on-chip memory. By organizing the on-chip memory into multiple banks, we can hide off-chip DRAM access delay by prefetching data into unused banks during computation. When and where to prefetch data, and how to reuse feature map data between layers, define the multi-bank on-chip memory management (MOMM) problem. In this paper, we propose compiler techniques that solve the MOMM problem with two different objectives: one is to minimize the off-chip memory access volume, and the other is to minimize the processing delay caused by unhidden DRAM accesses. By running CNN benchmarks on a cycle-level NPU simulator, we demonstrate the trade-off between the two objectives. Compared with a baseline approach that does not reuse feature maps between layers, the proposed techniques reduce the DRAM access volume and the processing delay by up to 55.0 and 79.4 percent, respectively. Moreover, we extend the proposed techniques to consider layer fusion, which aims to reuse feature maps between layers. Experimental results confirm the superiority of the proposed hybrid fusion technique over both the per-layer processing technique and the pure fusion technique.
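The idea of hiding DRAM delay by prefetching into an unused bank can be illustrated with a toy timeline model. This is only a sketch under simplifying assumptions, not the compiler technique proposed in the paper: the function name `total_delay`, the per-layer `(load_cycles, compute_cycles)` pairs, the single DRAM channel, and the rule that a bank becomes free as soon as the previous layer starts computing are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def total_delay(layers: List[Tuple[int, int]], prefetch: bool) -> int:
    """Return the end-to-end cycle count for a chain of layers.

    Each layer is a (load_cycles, compute_cycles) pair (hypothetical numbers).
    Assumptions: one DRAM channel, one compute unit, and, when `prefetch` is
    True, a spare bank that lets the next layer's data be loaded while the
    previous layer is still computing.
    """
    load_done = 0           # time at which the DRAM channel becomes idle
    compute_done = 0        # time at which the compute unit becomes idle
    prev_compute_start = 0  # time at which the previous layer began computing

    for load_cycles, compute_cycles in layers:
        if prefetch:
            # A bank is free once the previous layer has started computing,
            # so the load can overlap with that computation.
            load_start = max(load_done, prev_compute_start)
        else:
            # No spare bank: the load waits for the previous layer to finish.
            load_start = max(load_done, compute_done)
        load_done = load_start + load_cycles

        # Computation needs both its data and a free compute unit.
        compute_start = max(load_done, compute_done)
        compute_done = compute_start + compute_cycles
        prev_compute_start = compute_start

    return compute_done


if __name__ == "__main__":
    example = [(4, 10), (6, 10), (12, 5)]
    print("no prefetch:", total_delay(example, prefetch=False))
    print("prefetch   :", total_delay(example, prefetch=True))
```

In this model the remaining delay under prefetching consists of the first load plus any load that is longer than the computation it overlaps with, which corresponds to the "unhidden DRAM accesses" mentioned in the abstract.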


Pages: 1181-1193

Page count: 13
