SYQ: Learning Symmetric Quantization For Efficient Deep Neural Networks

Cited by: 76
Authors
Faraone, Julian [1 ]
Fraser, Nicholas [2 ]
Blott, Michaela [2 ]
Leong, Philip H. W. [1 ]
Affiliations
[1] Univ Sydney, Sydney, NSW, Australia
[2] Xilinx Res Labs, Dublin, Ireland
Source
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2018
DOI
10.1109/CVPR.2018.00452
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Inference for state-of-the-art deep neural networks is computationally expensive, making them difficult to deploy on constrained hardware environments. An efficient way to reduce this complexity is to quantize the weight parameters and/or activations during training by approximating their distributions with a limited-entry codebook. For very low precisions, such as binary or ternary networks with 1-8 bit activations, the information loss from quantization leads to significant accuracy degradation due to large gradient mismatches between the forward and backward functions. In this paper, we introduce a quantization method to reduce this loss by learning a symmetric codebook for particular weight subgroups. These subgroups are determined based on their locality in the weight matrix, such that the hardware simplicity of the low-precision representations is preserved. Empirically, we show that symmetric quantization can substantially improve accuracy for networks with extremely low-precision weights and activations. We also demonstrate that this representation imposes minimal or no additional hardware cost relative to more coarse-grained approaches. Source code is available at https://www.github.com/julianfaraone/SYQ.
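The abstract describes learning a symmetric codebook (e.g. {-α, 0, +α} for ternary weights) with one scaling coefficient per weight subgroup chosen by locality in the weight matrix. The following is a minimal illustrative sketch of that idea in NumPy, not the authors' implementation: the function name, the fixed magnitude threshold, and the choice of rows as subgroups are assumptions made here for clarity (the paper learns the scaling coefficients during training).

```python
import numpy as np

def symmetric_ternary_quantize(W, threshold=0.05):
    """Illustrative symmetric ternary quantization.

    Each weight maps to the symmetric codebook {-alpha_i, 0, +alpha_i},
    with one alpha per row (the row serving as a locality-based subgroup).
    Here alpha is set to the mean magnitude of the retained weights;
    in SYQ the scaling coefficients are learned via gradient descent.
    """
    # Weights below the threshold snap to 0; the rest keep only their sign.
    mask = (np.abs(W) > threshold).astype(W.dtype)
    signs = np.sign(W) * mask
    # One scaling coefficient per row-subgroup (mean magnitude of kept weights).
    counts = mask.sum(axis=1, keepdims=True)
    alpha = (np.abs(W) * mask).sum(axis=1, keepdims=True) / np.maximum(counts, 1)
    return alpha * signs

W = np.array([[0.5, -0.5, 0.01],
              [1.0, -0.2, 0.0]])
Q = symmetric_ternary_quantize(W)
# Row 0 -> [0.5, -0.5, 0.0]; row 1 -> [0.6, -0.6, 0.0]
```

Because the codebook is symmetric about zero, each row of the quantized matrix contains at most three distinct values, and a hardware datapath only needs one multiplier per subgroup plus sign logic.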
Pages: 4300-4309
Page count: 10