An Energy-Efficient Precision-Scalable ConvNet Processor in 40-nm CMOS

Cited by: 141
Authors
Moons, Bert [1]
Verhelst, Marian [1]
Affiliations
[1] Katholieke Univ Leuven, ESAT MICAS, Dept Elect Engn, B-3001 Leuven, Belgium
Funding
Research Foundation Flanders, Belgium
Keywords
Approximate computing; ConvNet; convolutional neural network (CNN); deep learning; Dynamic-Voltage-Accuracy-Scaling; processor architecture; voltage scaling;
DOI
10.1109/JSSC.2016.2636225
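Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809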
Abstract
A precision-scalable processor for low-power convolutional neural networks (ConvNets) is implemented in a 40-nm CMOS technology. To minimize energy consumption while maintaining throughput, this paper is the first to implement dynamic precision and energy scaling and to exploit the sparsity of convolutions in a dedicated processor architecture. The processor's 256 parallel processing units achieve a peak of 102 GOPS when running at 204 MHz and 1.1 V. It is fully C-programmable through a custom-generated compiler and consumes 25-287 mW at 204 MHz, with an efficiency that scales between 0.3 and 2.7 effective TOPS/W. It achieves 47 frames/s on the convolutional layers of the AlexNet benchmark while consuming only 76 mW. The system thereby outperforms the state of the art by up to five times in energy efficiency.
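As a rough reading aid for the AlexNet figures in the abstract, the energy spent per frame follows from dividing the reported power by the reported frame rate. The sketch below is only that arithmetic; the function and constant names are illustrative and do not come from the paper, and it assumes the 76 mW figure is the average power sustained at 47 frames/s.

```python
# Figures quoted in the abstract (convolutional layers of AlexNet).
POWER_W = 76e-3         # 76 mW, assumed to be average power at this frame rate
FRAME_RATE_FPS = 47.0   # 47 frames/s


def energy_per_frame_mj(power_w: float, fps: float) -> float:
    """Return energy per frame in millijoules (power divided by frame rate)."""
    if fps <= 0:
        raise ValueError("frame rate must be positive")
    return power_w / fps * 1e3


if __name__ == "__main__":
    mj = energy_per_frame_mj(POWER_W, FRAME_RATE_FPS)
    print(f"Energy per frame: {mj:.2f} mJ")
```

Under that assumption the result is about 1.6 mJ per frame for the convolutional layers.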
Pages: 903-914
Page count: 12