Low-Bit Quantization of Neural Network Based on Exponential Moving Average Knowledge Distillation

Cited by: 0
Authors
Lü J. [1,2]
Xu K. [1,2]
Wang D. [1,2]
Affiliations
[1] Institute of Information Science, Beijing Jiaotong University, Beijing
[2] Beijing Key Laboratory of Advanced Information Science and Network Technology, Beijing Jiaotong University, Beijing
Source
Corresponding author: Wang, Dong (wangdong@bjtu.edu.cn) | Science Press, Vol. 34
Funding
Beijing Natural Science Foundation;
Keywords
Deep Learning; Knowledge Distillation; Model Compression; Network Quantization;
DOI
10.16451/j.cnki.issn1003-6059.202112007
Abstract
The memory footprint and computational cost of deep neural networks restrict their widespread deployment, and network quantization is an effective compression method. However, in low-bit quantization, the classification accuracy of a neural network degrades as the number of quantization bits decreases. To address this problem, a low-bit quantization method for neural networks based on knowledge distillation is proposed. First, a small number of images are used for adaptive initialization to train the quantization step sizes of activations and weights, accelerating the convergence of the quantized network. Then, exponential moving average knowledge distillation is introduced to normalize the distillation loss and the task loss and to guide the training of the quantized network. Experiments on the ImageNet and CIFAR-10 datasets show that the performance of the proposed method is close to or better than that of the full-precision network. © 2021, Science Press. All rights reserved.
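For illustration, the following is a minimal PyTorch-style sketch of the two ideas summarized in the abstract: initializing a learnable quantization step size from a few calibration samples, and balancing the task loss against the distillation loss using exponential moving averages of their magnitudes. All names (LearnedStepQuantizer, EMALossBalancer, train_step) and the specific initialization heuristic are illustrative assumptions, not the paper's released code.

```python
# Sketch of adaptive step-size initialization and EMA-normalized distillation.
# Assumptions: a frozen full-precision teacher, a quantized student, and a
# standard temperature-scaled KL distillation loss.

import torch
import torch.nn as nn
import torch.nn.functional as F


class LearnedStepQuantizer(nn.Module):
    """Uniform quantizer with a learnable step size and a straight-through estimator.

    It would be inserted into the student's layers to quantize weights or
    activations; that wiring is omitted here for brevity.
    """

    def __init__(self, num_bits=4):
        super().__init__()
        self.num_bits = num_bits
        self.step = nn.Parameter(torch.tensor(1.0))
        self.initialized = False

    @torch.no_grad()
    def init_step(self, x):
        # Adaptive initialization from a few samples: map the observed dynamic
        # range onto the available quantization levels (heuristic, an assumption).
        qmax = 2 ** (self.num_bits - 1) - 1
        self.step.copy_(x.abs().mean() * 2.0 / qmax)
        self.initialized = True

    def forward(self, x):
        if not self.initialized:
            self.init_step(x)
        qmax = 2 ** (self.num_bits - 1) - 1
        # Round in the forward pass; pass the gradient straight through to x.
        q = torch.clamp(torch.round(x / self.step), -qmax - 1, qmax)
        return (q - x / self.step).detach() * self.step + x


class EMALossBalancer:
    """Normalizes the task loss and the distillation loss by EMAs of their magnitudes."""

    def __init__(self, decay=0.99):
        self.decay = decay
        self.ema_task = None
        self.ema_distill = None

    def __call__(self, task_loss, distill_loss):
        t, d = task_loss.detach(), distill_loss.detach()
        if self.ema_task is None:
            self.ema_task, self.ema_distill = t, d
        else:
            self.ema_task = self.decay * self.ema_task + (1 - self.decay) * t
            self.ema_distill = self.decay * self.ema_distill + (1 - self.decay) * d
        # Scale each term by its running magnitude so neither dominates training.
        return task_loss / (self.ema_task + 1e-8) + distill_loss / (self.ema_distill + 1e-8)


def train_step(student, teacher, balancer, images, labels, optimizer, T=4.0):
    """One training step of the quantized student guided by the full-precision teacher."""
    optimizer.zero_grad()
    s_logits = student(images)
    with torch.no_grad():
        t_logits = teacher(images)
    task_loss = F.cross_entropy(s_logits, labels)
    distill_loss = F.kl_div(
        F.log_softmax(s_logits / T, dim=1),
        F.softmax(t_logits / T, dim=1),
        reduction="batchmean",
    ) * T * T
    loss = balancer(task_loss, distill_loss)
    loss.backward()
    optimizer.step()
    return loss.item()
```

The EMA-based scaling replaces a hand-tuned weighting coefficient between the two losses: because each term is divided by a running estimate of its own magnitude, their contributions stay comparable as training progresses.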
Pages: 1143-1151
Number of pages: 8