LSFQ: A Low-Bit Full Integer Quantization for High-Performance FPGA-Based CNN Acceleration

Cited by: 5

Authors:

Bao, Zhenshan ^{[1]}

Fu, Guohang ^{[1]}

Zhang, Wenbo ^{[1]}

Zhan, Kang ^{[1]}

Guo, Junnan ^{[1]}

Affiliations:

[1] Beijing Univ Technol, Beijing, Peoples R China

Source:

IEEE MICRO | 2022 / Vol. 42 / No. 02

Keywords:

Quantization (signal); Convolutional neural networks; Design automation; Computer architecture; Field programmable gate arrays; Training data; Neural networks; Accelerator architectures;

DOI:

10.1109/MM.2021.3134968

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

The effective implementation of quantization depends not only on the specific task but also on the available hardware resources. This article presents a hardware-aware, customized quantization method for convolutional neural networks. We propose learnable-parameter soft-clipping full integer quantization (LSFQ), which quantizes both weights and activations using learnable clipping parameters. To verify the hardware awareness of the method, an LSFQ accelerator architecture is customized for the field-programmable gate array (FPGA) platform, in which the DSP48E2 is designed to compute six low-bit integer multiplications in parallel. The results show that the accuracy loss of LSFQ is less than 1% compared with the full-precision models, including VGG7 and MobileNetV2, on CIFAR-10 and CIFAR-100. An LSFQ accelerator was demonstrated at the 57th IEEE/ACM Design Automation Conference System Design Contest (DAC-SDC) and won the championship in the FPGA track.
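As a rough illustration of what quantization with a clipping parameter means, the sketch below maps a real value to a signed low-bit integer after clipping it to [-alpha, alpha]. It is not the LSFQ formulation: the abstract does not give the soft-clipping function or the training rule for the clipping parameter, so this example uses plain hard clipping, and the names `quantize`, `dequantize`, `alpha`, and `bits` are assumptions made for illustration only.

```python
def quantize(x, alpha, bits=4):
    """Clip x to [-alpha, alpha] and map it to a signed integer code.

    Assumption: hard clipping with a symmetric range. In LSFQ the clipping
    parameter is learnable and the clipping is "soft"; neither is modeled here.
    """
    qmax = 2 ** (bits - 1) - 1               # largest code, e.g. 7 for 4 bits
    clipped = max(-alpha, min(alpha, x))      # restrict x to the clipping range
    q = int(round(clipped * qmax / alpha))    # integer code in [-qmax, qmax]
    scale = alpha / qmax                      # real value of one integer step
    return q, scale


def dequantize(q, scale):
    """Recover the approximate real value from an integer code."""
    return q * scale


if __name__ == "__main__":
    for value in (10.0, 0.3, -10.0):
        q, scale = quantize(value, alpha=1.0, bits=4)
        print(value, "->", q, "->", dequantize(q, scale))
```

A smaller `alpha` gives finer resolution for values near zero but saturates more outliers, which is the trade-off a learnable clipping parameter is meant to tune.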


Pages: 8-15

Page count: 8
