Scaling for edge inference of deep neural networks

Times Cited: 328
Authors
Xu, Xiaowei [1 ]
Ding, Yukun [1 ]
Hu, Sharon Xiaobo [1 ]
Niemier, Michael [1 ]
Cong, Jason [2 ]
Hu, Yu [3 ]
Shi, Yiyu [1 ]
Affiliations
[1] Univ Notre Dame, Dept Comp Sci, Notre Dame, IN 46556 USA
[2] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90024 USA
[3] Huazhong Univ Sci & Technol, Sch Opt & Elect Informat, Wuhan, Hubei, Peoples R China
Keywords
ENERGY;
DOI
10.1038/s41928-018-0059-3
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline Code
0808; 0809;
Abstract
Deep neural networks offer considerable potential across a range of applications, from advanced manufacturing to autonomous cars. A clear trend in deep neural networks is the exponential growth of network size and the associated increases in computational complexity and memory consumption. However, the performance and energy efficiency of edge inference, in which the inference (the application of a trained network to new data) is performed locally on embedded platforms that have limited area and power budgets, is bounded by technology scaling. Here we analyse recent data and show that there are increasing gaps between the computational complexity and energy efficiency required by data scientists and the hardware capacity made available by hardware architects. We then discuss various architecture and algorithm innovations that could help to bridge the gaps.
Pages: 216-222
Number of Pages: 7
Cited References
119 in total
[81]  
Park SK, 2015, IEEE INT MEM WORKSH, P1
[82]  
Peters M. E., 2018, FPGA, DOI 10.1145/2847263.2847265
[83]   Training and operation of an integrated neuromorphic network based on metal-oxide memristors [J].
Prezioso, M. ;
Merrikh-Bayat, F. ;
Hoskins, B. D. ;
Adam, G. C. ;
Likharev, K. K. ;
Strukov, D. B. .
NATURE, 2015, 521 (7550) :61-64
[84]   A reconfigurable on-line learning spiking neuromorphic processor comprising 256 neurons and 128K synapses [J].
Qiao, Ning ;
Mostafa, Hesham ;
Corradi, Federico ;
Osswald, Marc ;
Stefanini, Fabio ;
Sumislawska, Dora ;
Indiveri, Giacomo .
FRONTIERS IN NEUROSCIENCE, 2015, 9
[85]   XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks [J].
Rastegari, Mohammad ;
Ordonez, Vicente ;
Redmon, Joseph ;
Farhadi, Ali .
COMPUTER VISION - ECCV 2016, PT IV, 2016, 9908 :525-542
[86]   Minerva: Enabling Low-Power, Highly-Accurate Deep Neural Network Accelerators [J].
Reagen, Brandon ;
Whatmough, Paul ;
Adolf, Robert ;
Rama, Saketh ;
Lee, Hyunkwang ;
Lee, Sae Kyu ;
Hernandez-Lobato, Jose Miguel ;
Wei, Gu-Yeon ;
Brooks, David .
2016 ACM/IEEE 43RD ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), 2016, :267-278
[87]  
Rosenberg C., 2013, GOOGLE RES BLOG 0612
[88]  
Sermanet P., 2014, P INT C LEARN REPR
[89]   ISAAC: A Convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars [J].
Shafiee, Ali ;
Nag, Anirban ;
Muralimanohar, Naveen ;
Balasubramonian, Rajeev ;
Strachan, John Paul ;
Hu, Miao ;
Williams, R. Stanley ;
Srikumar, Vivek .
2016 ACM/IEEE 43RD ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), 2016, :14-26
[90]   Mastering the game of Go with deep neural networks and tree search [J].
Silver, David ;
Huang, Aja ;
Maddison, Chris J. ;
Guez, Arthur ;
Sifre, Laurent ;
van den Driessche, George ;
Schrittwieser, Julian ;
Antonoglou, Ioannis ;
Panneershelvam, Veda ;
Lanctot, Marc ;
Dieleman, Sander ;
Grewe, Dominik ;
Nham, John ;
Kalchbrenner, Nal ;
Sutskever, Ilya ;
Lillicrap, Timothy ;
Leach, Madeleine ;
Kavukcuoglu, Koray ;
Graepel, Thore ;
Hassabis, Demis .
NATURE, 2016, 529 (7587) :484-+