LSFQ: A Low-Bit Full Integer Quantization for High-Performance FPGA-Based CNN Acceleration

Cited by: 5
Authors
Bao, Zhenshan [1]
Fu, Guohang [1]
Zhang, Wenbo [1]
Zhan, Kang [1]
Guo, Junnan [1]
Affiliations
[1] Beijing Univ Technol, Beijing, Peoples R China
Keywords
Quantization (signal); Convolutional neural networks; Design automation; Computer architecture; Field programmable gate arrays; Training data; Neural networks; Accelerator architectures
DOI
10.1109/MM.2021.3134968
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The effective implementation of quantization depends not only on the specific task but also on the available hardware resources. This article presents a hardware-aware, customized quantization method for convolutional neural networks. We propose learnable-parameter soft-clipping full integer quantization (LSFQ), which quantizes both weights and activations using learnable clipping parameters. Moreover, an LSFQ accelerator architecture is customized for the field-programmable gate array (FPGA) platform to verify the hardware awareness of our method; in it, the DSP48E2 slice is configured to compute six low-bit integer multiplications in parallel. The results show that the accuracy loss of LSFQ is less than 1% relative to full-precision models, including VGG7 and MobileNetV2, on CIFAR-10 and CIFAR-100. An LSFQ accelerator was demonstrated at the 57th IEEE/ACM Design Automation Conference System Design Contest (DAC-SDC) and won the championship in the FPGA track.
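To make the idea of a clipping parameter concrete, here is a minimal sketch of symmetric uniform quantization to signed integers. It is not the LSFQ method itself: the abstract does not give the soft-clipping formula, so this sketch assumes plain hard clipping, and the clipping threshold `alpha` is fixed by hand rather than learned during training. The names `quantize_symmetric`, `dequantize`, `alpha`, and `bits` are hypothetical and chosen only for illustration.

```python
def quantize_symmetric(values, alpha, bits):
    """Clip each value to [-alpha, alpha] and map it to a signed integer code.

    Assumption: hard clipping with a fixed threshold. In a method with
    learnable clipping, alpha would be a trainable parameter instead.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if bits < 2:
        raise ValueError("bits must be at least 2")
    qmax = 2 ** (bits - 1) - 1      # largest integer code, e.g. 7 for 4 bits
    scale = alpha / qmax            # real-valued step between adjacent codes
    codes = []
    for v in values:
        clipped = max(-alpha, min(alpha, v))          # saturate outliers
        codes.append(int(round(clipped * qmax / alpha)))
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate real values."""
    return [c * scale for c in codes]


# Illustrative values only; these are not taken from the paper.
weights = [-1.5, -0.4, 0.0, 0.26, 0.9, 2.0]
codes, scale = quantize_symmetric(weights, alpha=1.0, bits=4)
approx = dequantize(codes, scale)
print(codes)   # [-7, -3, 0, 2, 6, 7]
```

Values outside the threshold saturate at the extreme codes, so a smaller `alpha` gives finer resolution for small values at the cost of clipping more outliers. That trade-off is the usual motivation for treating the clipping threshold as a learnable quantity.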
Pages: 8-15
Page count: 8