UNIQ: Uniform Noise Injection for Non-Uniform Quantization of Neural Networks

Cited by: 75
Authors
Baskin, Chaim [1]
Liss, Natan [1]
Schwartz, Eli [2]
Zheltonozhskii, Evgenii [1]
Giryes, Raja [2]
Bronstein, Alex M. [1]
Mendelson, Avi [1]
Affiliations
[1] Technion, Dept Comp Sci, CS Taub Bldg, IL-3200003 Haifa, Israel
[2] Tel Aviv Univ, Sch Elect Engn, POB 39040, IL-6997801 Tel Aviv, Israel
Source
ACM TRANSACTIONS ON COMPUTER SYSTEMS | 2021 / Vol. 37 / Issue 1-4
Keywords
Deep learning; neural networks; quantization; efficient deep learning
DOI
10.1145/3444943
Chinese Library Classification number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
We present a novel method for neural network quantization. Our method, named UNIQ, emulates a non-uniform k-quantile quantizer and adapts the model to perform well with quantized weights by injecting noise into the weights at training time. As a by-product of injecting noise into the weights, we find that activations can also be quantized to as low as 8 bits with only a minor accuracy degradation. Our non-uniform quantization approach provides a novel alternative to the existing uniform quantization techniques for neural networks. We further propose a novel complexity metric, the number of bit operations performed (BOPs), and we show that this metric has a linear relation with logic utilization and power. We suggest evaluating the trade-off of accuracy vs. complexity (BOPs). The proposed method, when evaluated on ResNet18/34/50 and MobileNet on ImageNet, outperforms the prior state of the art both in the low-complexity regime and the high-accuracy regime. We demonstrate the practical applicability of this approach by implementing our non-uniformly quantized CNN on an FPGA.
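The abstract names two ingredients, a k-quantile quantizer and noise injected into the weights during training, without spelling out the procedure. The NumPy sketch below is one illustrative reading of those two ideas, not the authors' implementation. The function names `k_quantile_quantize` and `inject_uniform_noise`, the use of empirical quantiles, the choice of bin medians as representation levels, and the noise width of one bin in the CDF domain are all assumptions made for this example.

```python
import numpy as np


def k_quantile_quantize(w, k):
    """Map each weight to one of k levels so that every bin holds ~1/k of the weights.

    Assumption: bin edges and levels come from the empirical distribution of w.
    """
    # Bin edges at the i/k quantiles, i = 0..k.
    edges = np.quantile(w, np.linspace(0.0, 1.0, k + 1))
    # Representation level of each bin: its median, i.e. the (i + 0.5)/k quantile.
    levels = np.quantile(w, (np.arange(k) + 0.5) / k)
    # Interior edges split the real line into k bins indexed 0..k-1.
    idx = np.searchsorted(edges[1:-1], w, side="right")
    return levels[idx]


def inject_uniform_noise(w, k, rng):
    """Training-time surrogate for quantization (illustrative assumption).

    Each weight is moved to the empirical-CDF domain, perturbed by uniform
    noise spanning one bin width (1/k), and mapped back through the
    empirical quantile function.
    """
    sorted_w = np.sort(w.ravel())
    n = sorted_w.size
    # Empirical CDF value of every weight, strictly inside (0, 1).
    u = (np.searchsorted(sorted_w, w, side="right") - 0.5) / n
    # Uniform noise with total width 1/k, centred at zero.
    noise = rng.uniform(-0.5 / k, 0.5 / k, size=w.shape)
    u_noisy = np.clip(u + noise, 0.0, 1.0)
    # Inverse empirical CDF brings the perturbed values back to weight space.
    return np.quantile(sorted_w, u_noisy.ravel()).reshape(w.shape)
```

In this reading, a training loop would use `inject_uniform_noise` on the weights in the forward pass and `k_quantile_quantize` once at the end to obtain the deployed weights; because the bins are equiprobable, the noise is uniform in the CDF domain while the resulting quantizer is non-uniform in weight space.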
Pages: 1 / 4
Page count: 15