FPGA-QNN: Quantized Neural Network Hardware Acceleration on FPGAs

Cited by: 2
Authors
Tasci, Mustafa [1 ]
Istanbullu, Ayhan [2 ]
Tumen, Vedat [3 ]
Kosunalp, Selahattin [1 ]
Affiliations
[1] Bandirma Onyedi Eylul Univ, Gonen Vocat Sch, Dept Comp Technol, TR-10200 Bandirma, Turkiye
[2] Balikesir Univ, Fac Engn, Dept Comp Engn, TR-10145 Balikesir, Turkiye
[3] Bitlis Eren Univ, Fac Engn & Architecture, Dept Comp Engn, TR-13100 Bitlis, Turkiye
Source
APPLIED SCIENCES-BASEL | 2025, Vol. 15, Issue 02
Keywords
accelerator; FPGA; QNN; deep learning; FINN; ARCHITECTURE; FRAMEWORK;
DOI
10.3390/app15020688
CLC Classification Number
O6 [Chemistry];
Discipline Classification Code
0703;
Abstract
Recently, convolutional neural networks (CNNs) have received massive interest due to their ability to achieve high accuracy in various artificial intelligence tasks. As CNN models have grown more complex, a significant drawback is their high computational burden and memory requirements. The performance of a typical CNN model can be enhanced by improved hardware accelerators. Practical implementations on field-programmable gate arrays (FPGAs) have the potential to reduce resource utilization while maintaining low power consumption. Nevertheless, complex CNN models may require more computational and memory capacity than many current FPGAs provide. An effective solution to this issue is to use quantized neural network (QNN) models, which remove the burden of full-precision weights and activations. This article proposes an accelerator design framework for FPGAs, called FPGA-QNN, aimed at reducing the high computational burden and memory requirements of CNN implementations. To this end, FPGA-QNN exploits QNN models by converting full-precision weights and activations into integer operations. The framework provides 12 accelerators based on multi-layer perceptron (MLP) and LeNet CNN models, each associated with a specific combination of quantization and folding. Performance evaluations on a Xilinx PYNQ Z1 development board demonstrated the superiority of FPGA-QNN in resource utilization and energy efficiency compared to several recent approaches. The proposed MLP model classified the FashionMNIST dataset at 953 kFPS with 1019 GOPs while consuming 2.05 W.
Pages: 21
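The abstract describes converting full-precision weights and activations into integer operations. The paper's actual quantization scheme (implemented via FINN-style tooling) is not detailed in this record, so the following is only a minimal sketch of uniform symmetric quantization, the generic idea behind replacing floating-point multiply-accumulates with integer ones; the function name and the 4-bit example values are illustrative, not taken from the paper.

```python
import numpy as np

def quantize_symmetric(x, num_bits=4):
    """Uniform symmetric quantization: map float values to signed integers.

    Returns (q, scale) such that x is approximately q * scale and each
    entry of q fits in `num_bits` signed bits.
    """
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 7 for 4-bit signed
    scale = np.max(np.abs(x)) / qmax        # one scale factor per tensor
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int8)
    return q, scale

# Full-precision weights become small integers plus one float scale;
# the multiply-accumulate on the FPGA then runs on integers only.
w = np.array([0.31, -0.12, 0.07, -0.28])
q, s = quantize_symmetric(w, num_bits=4)
x_int = np.array([3, -1, 2, 4], dtype=np.int8)    # quantized activations
acc = int(np.dot(q.astype(np.int32), x_int))      # pure integer MAC
```

A single dequantization by `s` (and the activation scale) at the layer output recovers an approximation of the floating-point result, which is what makes integer-only datapaths on the FPGA fabric feasible.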