Optimizing Memory Efficiency for Deep Convolutional Neural Network Accelerators

Cited by: 1
Authors
Li, Xiaowei [1 ]
Li, Jiajun
Yan, Guihai [1 ]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, State Key Lab Comp Architecture, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Deep Convolutional Neural Networks; Accelerator Architecture; Memory Efficiency;
DOI
10.1166/jolpe.2018.1580
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Codes
0808; 0809;
Abstract
Convolutional Neural Network (CNN) accelerators have achieved notable performance and energy efficiency speedups compared to traditional general-purpose CPU- and GPU-based solutions. Although optimizations on computation have been intensively studied, the energy efficiency of such accelerators remains limited by off-chip memory accesses, since their energy cost is orders of magnitude higher than that of other operations. Minimizing off-chip memory access volume, therefore, is the key to further improving energy efficiency. The prior state-of-the-art uses rigid data reuse patterns and is sub-optimal for some, or even all, of the individual convolutional layers. To overcome this problem, this paper proposes an adaptive layer partitioning and scheduling scheme, called SmartShuttle, to minimize off-chip memory accesses for CNN accelerators. SmartShuttle can adaptively switch among different data reuse schemes and the corresponding tiling factor settings to dynamically match different convolutional layers and fully-connected layers. Moreover, SmartShuttle thoroughly investigates the impact of data reusability and sparsity on the memory access volume. The experimental results show that SmartShuttle processes the convolutional layers at 434.8 multiply-and-accumulate operations (MACs)/DRAM access for VGG16 (batch size = 3), and 526.3 MACs/DRAM access for AlexNet (batch size = 4), which outperforms the state-of-the-art approach (Eyeriss) by 52.2% and 52.6%, respectively.
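The core idea described in the abstract, choosing a data reuse scheme and tiling factor per layer to minimize off-chip traffic, can be sketched as a small search over candidate configurations. The cost model, buffer size, scheme names, and tiling choices below are illustrative assumptions for exposition only, not the paper's actual model:

```python
BYTES = 2            # assumption: 16-bit activations and weights
BUF = 108 * 1024     # assumed on-chip buffer capacity in bytes

def traffic(layer, scheme, t):
    """Toy estimate of off-chip bytes for one conv layer under a reuse
    scheme and tiling factor t; returns None if the tile does not fit."""
    H, W, C, K, R = (layer[k] for k in "HWCKR")
    ifmap = H * W * C * BYTES          # input feature map size
    weights = R * R * C * K * BYTES    # filter weights size
    ofmap = H * W * K * BYTES          # output feature map size
    if scheme == "psum_reuse":
        # Hold weights for t output channels on-chip; the ifmap is
        # streamed once per group of t output channels.
        if R * R * C * t * BYTES > BUF:
            return None
        return ifmap * -(-K // t) + weights + ofmap   # -(-a//b) = ceil
    if scheme == "ifmap_reuse":
        # Hold a band of t ifmap rows on-chip; weights are streamed
        # once per band.
        if t * W * C * BYTES > BUF:
            return None
        return ifmap + weights * -(-H // t) + ofmap
    raise ValueError(scheme)

def pick(layer, tiles=(4, 8, 16, 32, 64)):
    """Return the (scheme, tile, bytes) triple with minimal estimated
    DRAM traffic among the feasible candidates for this layer."""
    cands = [(s, t, traffic(layer, s, t))
             for s in ("psum_reuse", "ifmap_reuse") for t in tiles]
    return min((c for c in cands if c[2] is not None), key=lambda c: c[2])
```

Under this toy model, a spatially large convolutional layer tends to favor keeping partial sums and weights on-chip, while a fully-connected layer (dominated by weight volume) favors reading the small input once and streaming the weights, which mirrors the per-layer adaptivity the abstract attributes to SmartShuttle.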
Pages: 496-507 (12 pages)