Adaptive joint compression method for deep neural networks

Cited by: 0
Authors
Yao B. [1]
Peng X. [1]
Yu X. [1]
Liu L. [1]
Peng Y. [1]
Affiliations
[1] Department of Test and Control Engineering, Harbin Institute of Technology, Harbin
Source
Yi Qi Yi Biao Xue Bao/Chinese Journal of Scientific Instrument | 2023, Vol. 44, No. 05
Keywords
deep neural network; joint optimization; model compression; quantization; sparsity
DOI
10.19650/j.cnki.cjsi.J2311004
Abstract
Deep neural network compression methods that rely on a single, fixed pattern struggle to compress a model sufficiently because of the limit imposed by accuracy loss. As a result, the compressed model still consumes costly and limited storage resources when deployed, which is a significant barrier to its use on edge devices. To address this problem, this article proposes an adaptive joint compression method that optimizes the model structure and the weight bit-width in parallel. In contrast to most existing combined compression methods, sparsity and quantization are fully fused during joint compression training to reduce parameter redundancy comprehensively. Meanwhile, layer-wise adaptive sparsity ratios and weight bit-widths are designed to resolve the sub-optimization of model accuracy and mitigate the accuracy loss caused by a fixed compression ratio. Experimental results for VGG, ResNet, and MobileNet on the CIFAR-10 dataset show that the proposed method achieves parameter compression ratios of 143.0×, 151.6×, and 19.7×, with corresponding accuracy losses of 1.3%, 2.4%, and 0.9%, respectively. In addition, compared with 12 typical compression methods, the proposed method reduces hardware memory consumption by 15.3×~148.5×. On a self-built remote sensing optical image dataset, it achieves a maximum compression ratio of 284.2× while keeping the accuracy loss within 1.2%. © 2023 Science Press. All rights reserved.
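The abstract describes combining layer-wise sparsification with layer-wise weight quantization. The paper's actual algorithm is not reproduced in this record, so the snippet below is only a minimal illustrative sketch of the general idea: per-layer magnitude pruning combined with uniform symmetric weight quantization. The layer names, sparsity ratios, and bit-widths are hypothetical placeholders; in the paper these settings are adapted per layer during joint training rather than fixed by hand.

# Illustrative sketch only: joint per-layer sparsification and weight quantization.
# This is NOT the paper's algorithm; layer names, sparsity ratios, and bit-widths
# below are hypothetical placeholders chosen for demonstration.
import torch
import torch.nn as nn


def magnitude_prune(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Zero out the smallest-magnitude weights so that `sparsity` fraction is pruned."""
    k = int(weight.numel() * sparsity)
    if k == 0:
        return weight
    threshold = weight.abs().flatten().kthvalue(k).values
    mask = (weight.abs() > threshold).to(weight.dtype)
    return weight * mask


def quantize_symmetric(weight: torch.Tensor, bits: int) -> torch.Tensor:
    """Uniform symmetric quantization to `bits` bits, dequantized back to float."""
    qmax = 2 ** (bits - 1) - 1
    scale = weight.abs().max() / qmax
    if scale == 0:
        return weight
    return torch.clamp(torch.round(weight / scale), -qmax - 1, qmax) * scale


# Hypothetical per-layer settings: name -> (sparsity ratio, weight bit-width).
layer_config = {"conv1": (0.5, 8), "conv2": (0.9, 4)}

model = nn.Sequential()
model.add_module("conv1", nn.Conv2d(3, 16, 3, padding=1))
model.add_module("conv2", nn.Conv2d(16, 32, 3, padding=1))

with torch.no_grad():
    for name, module in model.named_modules():
        if name in layer_config:
            sparsity, bits = layer_config[name]
            pruned = magnitude_prune(module.weight.data, sparsity)
            module.weight.data = quantize_symmetric(pruned, bits)

A real joint method would additionally learn the per-layer settings and fine-tune the network with the pruning masks and quantizers active inside the training loop; this sketch only shows the two operators applied once, after training.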
Pages: 21-32
Number of pages: 11
References (32 in total)
[1] LIU ZH, SUN J D, WEN J T., Bearing fault diagnosis method based on multi-dimension compressed deep neural network, Journal of Electronic Measurement and Instrumentation, 36, 7, pp. 189-198, (2022)
[2] GAO H, TIAN Y L, XU F Y, et al., Survey of deep learning model compression and acceleration, Journal of Software, 32, 1, pp. 68-92, (2021)
[3] GHIMIRE D, KIL D, KIM S H., A survey on efficient convolutional neural networks and hardware acceleration, Electronics, 11, 6, (2022)
[4] DENG L, LI G, HAN S, et al., Model compression and hardware acceleration for neural networks: A comprehensive survey, Proceedings of the IEEE, 108, 4, pp. 485-532, (2020)
[5] PENG J SH, SUN L X, WANG K, et al., ED-YOLO power inspection UAV obstacle avoidance target detection algorithm based on model compression, Chinese Journal of Scientific Instrument, 42, 10, pp. 161-170, (2021)
[6] ZANG ZH K, PANG W G, XIE W J, et al., Deep learning for real-time applications: A survey, Journal of Software, 31, 9, pp. 2654-2677, (2020)
[7] HAN S, POOL J, TRAN J, et al., Learning both weights and connections for efficient neural network, Proceedings of the 28th International Conference on Neural Information Processing Systems, 1, pp. 1135-1143, (2015)
[8] JACOB B, KLIGYS S, CHEN B, et al., Quantization and training of neural networks for efficient integer-arithmetic-only inference, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2704-2713, (2018)
[9] HAN S, MAO H, DALLY W., Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding, International Conference on Learning Representations, (2016)
[10] YANG H, GUI S, ZHU Y, et al., Automatic neural network compression by sparsity-quantization joint learning: A constrained optimization-based approach, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2178-2188, (2020)