Nebula: A Scalable and Flexible Accelerator for DNN Multi-Branch Blocks on Embedded Systems

Cited by: 0

Authors:

Yang, Dawei [1]

Li, Xinlei [2]

Qi, Lizhe [1]

Zhang, Wenqiang [1]

Jiang, Zhe [3]

Affiliations:

[1] Fudan Univ, Acad Engn & Technol, Shanghai 200433, Peoples R China

[2] Shanghai Univ Int Business & Econ, Sch Stat & Informat, Shanghai 201620, Peoples R China

[3] Univ Cambridge, Comp Sci, Cambridge CB3 0FD, England

Source:

ELECTRONICS | 2022 / Vol. 11 / Issue 04

Keywords:

DNN accelerators; multi-branch network; energy-efficient accelerators

DOI:

10.3390/electronics11040505

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Subject classification code:

0812

Abstract:

Deep neural networks (DNNs) are widely used in artificial intelligence applications, and many specialized DNN-inference accelerators have been proposed. However, existing DNN accelerators rely heavily on specific types of DNN operations (such as Conv, FC, and ReLU), which are either less used or likely to become out of date in the future, posing flexibility and compatibility challenges for existing work. This paper designs a flexible DNN accelerator from a more generic perspective, rather than speeding up particular types of DNN operations. The proposed accelerator, Nebula, exploits the width property of DNNs and achieves a significant improvement in system throughput and energy efficiency for multi-branch architectures. Nebula is a first-of-its-kind framework for multi-branch DNNs.


Pages: 13
