Sparseness Ratio Allocation and Neuron Re-pruning for Neural Networks Compression

Cited by: 0
Authors
Guo, Li [1 ]
Zhou, Dajiang [1 ]
Zhou, Jinjia [2 ,3 ,4 ]
Kimura, Shinji [1 ]
Affiliations
[1] Waseda Univ, Grad Sch Informat Prod & Syst, Kitakyushu, Fukuoka, Japan
[2] Hosei Univ, Grad Sch Sci, Tokyo, Japan
[3] Hosei Univ, Engn, Tokyo, Japan
[4] JST, PRESTO, Tokyo, Japan
Source
2018 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS) | 2018
Funding
Japan Society for the Promotion of Science (JSPS);
Keywords
Model compression; connection/neuron pruning; sparseness ratio allocation; neuron re-pruning;
DOI
10.1109/ISCAS.2018.8351094
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline codes
0808; 0809;
Abstract
Convolutional neural networks (CNNs) are rapidly gaining popularity in artificial intelligence applications and are increasingly deployed on mobile devices. This is challenging, however, because of the high computational complexity of CNNs and the limited hardware resources of mobile devices. Compressing the CNN model is an efficient way to address this issue. This work presents a new model compression framework comprising sparseness ratio allocation (SRA) and neuron re-pruning (NRP). SRA determines the percentage of weights to prune in each layer so as to achieve a higher overall sparseness ratio. NRP is performed after conventional weight pruning to further remove redundant neurons while guaranteeing accuracy. Experimental results show that, with a slight accuracy drop of 0.1%, the proposed framework achieves 149.3x compression on LeNet-5. The storage size is reduced by about 50% relative to previous works, and 8-45.2% of computational energy and 11.5-48.2% of memory traffic energy are saved.
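The two stages described in the abstract can be illustrated with a minimal sketch. The functions below are hypothetical, not the authors' implementation: `prune_layer` performs standard magnitude-based weight pruning at a per-layer sparseness ratio (as SRA would allocate one ratio per layer), and `re_prune_neurons` drops neurons whose incoming weights were all zeroed, a simplified stand-in for the NRP step.

```python
import numpy as np

def prune_layer(weights: np.ndarray, sparseness_ratio: float) -> np.ndarray:
    """Magnitude-based pruning: zero out the smallest-magnitude weights
    so that roughly `sparseness_ratio` of the entries become zero."""
    flat = np.abs(weights).ravel()
    k = int(sparseness_ratio * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def re_prune_neurons(weights: np.ndarray):
    """Simplified neuron re-pruning: remove output neurons (rows) whose
    incoming weights were all pruned away, since they contribute nothing."""
    alive = ~np.all(weights == 0.0, axis=1)
    return weights[alive], alive
```

For example, pruning a 2x4 layer at a 0.5 ratio can leave one neuron with no surviving incoming weights, which the re-pruning step then removes entirely, shrinking the layer. In the actual paper, SRA chooses a different ratio per layer to maximize the overall sparseness under an accuracy constraint.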
Pages: 5