UNIQ: Uniform Noise Injection for Non-Uniform Quantization of Neural Networks

Cited by: 72
Authors
Baskin, Chaim [1 ]
Liss, Natan [1 ]
Schwartz, Eli [2 ]
Zheltonozhskii, Evgenii [1 ]
Giryes, Raja [2 ]
Bronstein, Alex M. [1 ]
Mendelson, Avi [1 ]
Affiliations
[1] Technion, Dept Comp Sci, CS Taub Bldg, IL-3200003 Haifa, Israel
[2] Tel Aviv Univ, Sch Elect Engn, POB 39040, IL-6997801 Tel Aviv, Israel
Source
ACM TRANSACTIONS ON COMPUTER SYSTEMS | 2021 / Vol. 37 / Issue 1-4
Keywords
Deep learning; neural networks; quantization; efficient deep learning
DOI
10.1145/3444943
Chinese Library Classification (CLC)
TP301 [Theory, Methods]
Discipline code
081202
Abstract
We present a novel method for neural network quantization. Our method, named UNIQ, emulates a non-uniform k-quantile quantizer and adapts the model to perform well with quantized weights by injecting noise into the weights at training time. As a by-product of injecting noise into the weights, we find that activations can also be quantized to as low as 8 bits with only minor accuracy degradation. Our non-uniform quantization approach provides a novel alternative to existing uniform quantization techniques for neural networks. We further propose a novel complexity metric, the number of bit operations performed (BOPs), and show that this metric is linearly related to logic utilization and power. We suggest evaluating the trade-off between accuracy and complexity (BOPs). The proposed method, when evaluated on ResNet18/34/50 and MobileNet on ImageNet, outperforms the prior state of the art in both the low-complexity regime and the high-accuracy regime. We demonstrate the practical applicability of this approach by implementing our non-uniformly quantized CNN on an FPGA.
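To make the abstract's two technical ingredients more concrete, the sketch below illustrates (i) training-time uniform noise injection that emulates a k-quantile weight quantizer and (ii) a per-layer bit-operations (BOPs) count. This is a minimal PyTorch sketch under our own assumptions rather than the authors' released implementation; the helper names `kquantile_edges`, `inject_uniform_noise`, and `conv_bops` are hypothetical, and the BOPs formula here is an assumed approximation of the metric described in the abstract.

```python
import math
import torch

def kquantile_edges(w: torch.Tensor, k: int) -> torch.Tensor:
    # Bin edges of a k-quantile quantizer: each bin holds roughly 1/k of the weights.
    probs = torch.linspace(0.0, 1.0, k + 1, dtype=w.dtype, device=w.device)
    return torch.quantile(w.flatten(), probs)

def inject_uniform_noise(w: torch.Tensor, k: int) -> torch.Tensor:
    # Training-time surrogate for quantization: add uniform noise whose support
    # matches the width of the quantile bin that each weight falls into.
    edges = kquantile_edges(w, k)
    idx = torch.clamp(torch.bucketize(w, edges) - 1, 0, k - 1)  # bin index per weight
    widths = edges[idx + 1] - edges[idx]                         # per-weight bin width
    noise = (torch.rand_like(w) - 0.5) * widths                  # U(-width/2, +width/2)
    return w + noise

def conv_bops(c_in: int, c_out: int, ksize: int, out_hw: int, b_w: int, b_a: int) -> int:
    # Assumed approximate bit-operations count for one conv layer: each MAC costs
    # b_a * b_w bit operations for the multiply, plus an accumulator of roughly
    # b_a + b_w + log2(c_in * ksize^2) bits for the add.
    macs = c_out * out_hw * out_hw * c_in * ksize * ksize
    acc_bits = b_a + b_w + math.ceil(math.log2(c_in * ksize * ksize))
    return macs * (b_a * b_w + acc_bits)
```

In a training loop one would perturb the full-precision weights with `inject_uniform_noise` in the forward pass while updating the unperturbed copies, then replace each weight with its quantile-bin representative at inference; accuracy would then be reported against the summed `conv_bops` over all layers.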
Page numbers: 1 / 4
Number of pages: 15