Low-Bit Quantization of Neural Network Based on Exponential Moving Average Knowledge Distillation

Cited by: 0
Authors
Lü J. [1,2]
Xu K. [1,2]
Wang D. [1,2]
Affiliations
[1] Institute of Information Science, Beijing Jiaotong University, Beijing
[2] Beijing Key Laboratory of Advanced Information Science and Network Technology, Beijing Jiaotong University, Beijing
Source
Pattern Recognition and Artificial Intelligence, Science Press, 2021, Vol. 34, No. 12. Corresponding author: Wang, Dong (wangdong@bjtu.edu.cn)
Funding
Beijing Natural Science Foundation
Keywords
Deep Learning; Knowledge Distillation; Model Compression; Network Quantization;
DOI
10.16451/j.cnki.issn1003-6059.202112007
Abstract
The memory footprint and computational cost of deep neural networks restrict their deployment in practical applications, and network quantization is an effective compression method. However, in low-bit quantization the classification accuracy of the network degrades as the number of quantization bits decreases. To address this problem, a low-bit quantization method for neural networks based on knowledge distillation is proposed. Firstly, a small number of images are exploited for adaptive initialization, training the quantization steps of activations and weights to speed up the convergence of the quantized network. Then, the idea of exponential moving average knowledge distillation is introduced to normalize the distillation loss and the task loss and to guide the training of the quantized network. Experiments on the ImageNet and CIFAR-10 datasets show that the performance of the proposed method is close to or better than that of the full-precision network. © 2021, Science Press. All rights reserved.
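The abstract outlines two technical steps: initializing the quantization steps from a few images, and balancing the task and distillation losses with an exponential moving average. Below is a minimal PyTorch-style sketch of one plausible reading, assuming "exponential moving average knowledge distillation" divides each loss term by an EMA of its own recent magnitude so the two terms stay on a comparable scale; all names (init_step_size, EMALossNormalizer, decay, temperature) and defaults are hypothetical illustrations, not the paper's implementation.

    import torch
    import torch.nn.functional as F

    def init_step_size(tensor, num_bits=4):
        # LSQ-style closed-form initialization of the quantization step
        # (an assumption; the paper instead trains the step on a few images).
        qmax = 2 ** (num_bits - 1) - 1
        return 2 * tensor.abs().mean() / (qmax ** 0.5)

    class EMALossNormalizer:
        """Divides a loss by an EMA of its own magnitude."""
        def __init__(self, decay=0.99, eps=1e-8):
            self.decay = decay  # EMA decay rate (hypothetical default)
            self.eps = eps
            self.ema = None     # running estimate of the loss magnitude

        def __call__(self, loss):
            value = loss.detach()
            if self.ema is None:
                self.ema = value
            else:
                self.ema = self.decay * self.ema + (1 - self.decay) * value
            return loss / (self.ema + self.eps)

    task_norm = EMALossNormalizer()
    distill_norm = EMALossNormalizer()

    def training_step(student_logits, teacher_logits, labels, temperature=4.0):
        # Cross-entropy task loss on the quantized (student) network.
        task_loss = F.cross_entropy(student_logits, labels)
        # KL distillation loss against the full-precision teacher.
        distill_loss = F.kl_div(
            F.log_softmax(student_logits / temperature, dim=1),
            F.softmax(teacher_logits / temperature, dim=1),
            reduction="batchmean",
        ) * temperature ** 2
        # EMA normalization keeps both terms on a comparable scale,
        # replacing a hand-tuned balancing weight.
        return task_norm(task_loss) + distill_norm(distill_loss)

    if __name__ == "__main__":
        student = torch.randn(8, 10)   # dummy student logits
        teacher = torch.randn(8, 10)   # dummy teacher logits
        labels = torch.randint(0, 10, (8,))
        print(training_step(student, teacher, labels).item())

Dividing by a detached EMA scales each gradient by a running estimate of its loss magnitude, which is one way to normalize the distillation and task losses without a hand-tuned weight, consistent with the abstract's description.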
Pages: 1143-1151
Number of pages: 8
References (29 in total)
[21]  
RUSSAKOVSKY O, DENG J, SU H, et al., ImageNet Large Scale Visual Recognition Challenge, International Journal of Computer Vision, 115, 3, pp. 211-252, (2015)
[22]  
HOWARD A G, ZHU M L, CHEN B, et al., MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, arXiv preprint arXiv:1704.04861, (2017)
[23]  
SANDLER M, HOWARD A, ZHU M L, et al., MobileNetV2: Inverted Residuals and Linear Bottlenecks, Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4510-4520, (2018)
[24]  
REDMON J, FARHADI A., YOLO9000: Better, Faster, Stronger, Proc of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6517-6525, (2017)
[25]  
HE K M, ZHANG X Y, REN S Q, et al., Deep Residual Learning for Image Recognition, Proc of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778, (2016)
[26]  
KINGMA D P, BA J., Adam: A Method for Stochastic Optimization, Proc of the 3rd International Conference on Learning Representations, (2015)
[27]  
GONG R H, LIU X L, JIANG S H, et al., Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks, Proc of the IEEE/CVF International Conference on Computer Vision, pp. 4851-4860, (2019)
[28]  
BHALGAT Y, LEE J, NAGEL M, et al., LSQ+: Improving Low-Bit Quantization through Learnable Offsets and Better Initialization, Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 2978-2985, (2020)
[29]  
WANG K, LIU Z J, LIN Y J, et al., HAQ: Hardware-Aware Automated Quantization with Mixed Precision, Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8604-8612, (2019)