Networks-on-Chip based Deep Neural Networks Accelerators for IoT Edge Devices

Cited by: 6
Authors
Ascia, Giuseppe [1]
Catania, Vincenzo [1]
Monteleone, Salvatore [1]
Palesi, Maurizio [1]
Patti, Davide [1]
Jose, John [2]
Affiliations
[1] Univ Catania, Catania, Italy
[2] Indian Inst Technol, Gauhati, India
Source
2019 SIXTH INTERNATIONAL CONFERENCE ON INTERNET OF THINGS: SYSTEMS, MANAGEMENT AND SECURITY (IOTSMS) | 2019
Keywords
Deep Neural Network accelerator; Network-on-Chip; Performance evaluation; Energy analysis; Design space exploration; IoT edge devices
DOI
10.1109/iotsms48152.2019.8939236
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The need to perform deep neural network inference on resource-constrained embedded devices (e.g., Internet of Things nodes) requires specialized architectures that achieve the best trade-off among performance, energy, and cost. One of the most promising architectures in this context is based on massively parallel, specialized cores interconnected by a Network-on-Chip (NoC). In this paper, we extensively evaluate NoC-based deep neural network accelerators by exploring the design space spanned by several architectural parameters, including network size, routing algorithm, local memory size, link width, and number of memory interfaces. We show that latency is dominated mainly by on-chip communication, whereas energy consumption is accounted for mainly by memory (both on-chip and off-chip). The outcome of the analysis thus points toward a research line devoted to optimizing the on-chip communication fabric for performance improvement and the memory subsystem for energy efficiency.
Pages: 227-234
Page count: 8