Bit Prudent In-Cache Acceleration of Deep Convolutional Neural Networks

Cited by: 29
Authors
Wang, Xiaowei [1]
Yu, Jiecao [1]
Augustine, Charles [2]
Iyer, Ravi [2]
Das, Reetuparna [1]
Affiliations
[1] Univ Michigan, Ann Arbor, MI 48109 USA
[2] Intel Corp, Santa Clara, CA 95051 USA
Funding
U.S. National Science Foundation
Keywords
In-Memory Computing; Cache; Neural Network Pruning; Low Precision Neural Network
DOI
10.1109/HPCA.2019.00029
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We propose Bit Prudent In-Cache Acceleration of Deep Convolutional Neural Networks, an in-SRAM architecture for accelerating Convolutional Neural Network (CNN) inference by leveraging network redundancy and massive parallelism. The network redundancy is exploited in two ways. First, we prune and fine-tune the trained network model and develop two distinct methods, coalescing and overlapping, to run inference efficiently with sparse models. Second, we propose an architecture for network models with a reduced bit width by leveraging bit-serial computation. Our proposed architecture achieves a 17.7x/3.7x speedup over a server-class CPU/GPU and a 1.6x speedup over the relevant in-cache accelerator, with 2% area overhead per processor die and no loss in top-1 accuracy for AlexNet. With a relaxed accuracy limit, our tunable architecture achieves higher speedups.
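As a rough illustration of why reduced bit width and pruning help under bit-serial computation, the sketch below computes a dot product one weight bit-plane at a time. It is a software analogy only, not the paper's in-SRAM design: the function name `bit_serial_dot`, the restriction to unsigned weights, and the plain Python lists are all assumptions made for this example.

```python
def bit_serial_dot(activations, weights, weight_bits):
    """Dot product computed one weight bit-plane at a time.

    Assumption for this sketch: weights are unsigned integers that fit in
    `weight_bits` bits. The loop runs once per bit position, so a narrower
    weight format means fewer steps.
    """
    if len(activations) != len(weights):
        raise ValueError("activations and weights must have the same length")
    if any(w < 0 or w >= (1 << weight_bits) for w in weights):
        raise ValueError("weights must fit in weight_bits unsigned bits")

    acc = 0
    for b in range(weight_bits):  # one step per weight bit position
        # Sum the activations whose weight has bit b set; a pruned (zero)
        # weight never has a bit set, so it contributes nothing.
        plane_sum = sum(a for a, w in zip(activations, weights) if (w >> b) & 1)
        acc += plane_sum << b  # scale this bit-plane by 2**b
    return acc


activations = [3, 1, 4, 1]
weights = [5, 0, 2, 7]  # 3-bit weights; the zero stands in for a pruned weight
result = bit_serial_dot(activations, weights, weight_bits=3)
reference = sum(a * w for a, w in zip(activations, weights))
```

Here `result` matches the ordinary dot product in `reference`, and the loop takes three steps because the weights are three bits wide. An 8-bit format would take eight steps over the same data.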
Pages: 81-93
Number of pages: 13