Nebula: A Scalable and Flexible Accelerator for DNN Multi-Branch Blocks on Embedded Systems

Cited by: 0
Authors
Yang, Dawei [1]
Li, Xinlei [2]
Qi, Lizhe [1]
Zhang, Wenqiang [1]
Jiang, Zhe [3]
Affiliations
[1] Fudan Univ, Acad Engn & Technol, Shanghai 200433, Peoples R China
[2] Shanghai Univ Int Business & Econ, Sch Stat & Informat, Shanghai 201620, Peoples R China
[3] Univ Cambridge, Comp Sci, Cambridge CB3 0FD, England
Keywords
DNN accelerators; multi-branch network; energy-efficient accelerators
DOI
10.3390/electronics11040505
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Deep neural networks (DNNs) are widely used in artificial intelligence applications, and many specialized DNN-inference accelerators have been proposed. However, existing DNN accelerators rely heavily on specific types of DNN operations (such as Conv, FC, and ReLU), which are either falling out of use or likely to become outdated in the future, posing flexibility and compatibility challenges for existing work. This paper designs a flexible DNN accelerator from a more generic perspective rather than speeding up particular types of DNN operations. The proposed accelerator, Nebula, exploits the width property of DNNs and achieves a significant improvement in system throughput and energy efficiency for multi-branch architectures. Nebula is a first-of-its-kind framework for multi-branch DNNs.
Pages: 13