Review of neural network model acceleration techniques based on FPGA platforms

Cited by: 4
Authors
Liu, Fang [1 ,2 ]
Li, Heyuan [3 ]
Hu, Wei [3 ]
He, Yanxiang [1 ]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Wuhan, Peoples R China
[2] Wuhan Vocat Coll Software & Engn, Wuhan, Peoples R China
[3] Wuhan Univ Sci & Technol, Coll Comp Sci, Wuhan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Neural network model; FPGA; Algorithm hardware collaboration; Acceleration and optimization; DESIGN-SPACE EXPLORATION; POWER ESTIMATION; HIGH-PERFORMANCE; ARCHITECTURE; HARDWARE; IMPLEMENTATION; SEARCH; CNN;
DOI
10.1016/j.neucom.2024.128511
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Neural network models, celebrated for their outstanding scalability and computational capabilities, have demonstrated remarkable performance across various fields such as vision, language, and multimodality. The rapid advancement of neural networks, fueled by the deep development of Internet technology and the growing demand for intelligent edge devices, introduces new challenges, including large model parameter sizes and increased storage pressure. In this context, Field-Programmable Gate Arrays (FPGAs) have emerged as a preferred platform for accelerating neural network models, thanks to their high performance, energy efficiency, flexibility, and scalability. Building FPGA-based neural network systems requires bridging significant differences in objectives, methods, and design spaces between model design and hardware design. This review adopts a comprehensive analytical framework to explore multidimensional implementation strategies, encompassing optimizations at the algorithmic and hardware levels as well as compiler optimization techniques. It focuses on methods for collaborative optimization between algorithms and hardware, identifies challenges in the co-design process, and proposes corresponding implementation strategies and key steps. Across these technological dimensions, the article provides in-depth technical analysis and discussion, aiming to offer valuable insights for research on optimizing and accelerating neural network models in edge computing environments.
Pages: 36
References (185 total)
  • [1] APNAS: Accuracy-and-Performance-Aware Neural Architecture Search for Neural Hardware Accelerators
    Achararit, Paniti
    Hanif, Muhammad Abdullah
    Putra, Rachmad Vidya Wicaksana
    Shafique, Muhammad
    Hara-Azumi, Yuko
    [J]. IEEE ACCESS, 2020, 8 : 165319 - 165334
  • [2] NullHop: A Flexible Convolutional Neural Network Accelerator Based on Sparse Representations of Feature Maps
    Aimar, Alessandro
    Mostafa, Hesham
    Calabrese, Enrico
    Rios-Navarro, Antonio
    Tapiador-Morales, Ricardo
    Lungu, Iulia-Alexandra
    Milde, Moritz B.
    Corradi, Federico
    Linares-Barranco, Alejandro
    Liu, Shih-Chii
    Delbruck, Tobi
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (03) : 644 - 656
  • [3] SnaPEA: Predictive Early Activation for Reducing Computation in Deep Convolutional Neural Networks
    Akhlaghi, Vahideh
    Yazdanbakhsh, Amir
    Samadi, Kambiz
    Gupta, Rajesh K.
    Esmaeilzadeh, Hadi
    [J]. 2018 ACM/IEEE 45TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), 2018, : 662 - 673
  • [4] Data Reorganization in Memory Using 3D-stacked DRAM
    Akin, Berkin
    Franchetti, Franz
    Hoe, James C.
    [J]. 2015 ACM/IEEE 42ND ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), 2015, : 131 - 143
  • [5] Bit-Pragmatic Deep Neural Network Computing
    Albericio, Jorge
    Delmas, Alberto
    Judd, Patrick
    Sharify, Sayeh
    O'Leary, Gerard
    Genov, Roman
    Moshovos, Andreas
    [J]. 50TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO), 2017, : 382 - 394
  • [6] Alwani M, 2016, INT SYMP MICROARCH
  • [7] Optimization of Convolutional Neural Networks on Resource Constrained Devices
    Arish, S.
    Sinha, Sharad
    Smitha, K. G.
    [J]. 2019 IEEE COMPUTER SOCIETY ANNUAL SYMPOSIUM ON VLSI (ISVLSI 2019), 2019, : 19 - 24
  • [8] An OpenCL™ Deep Learning Accelerator on Arria 10
    Aydonat, Utku
    O'Connell, Shane
    Capalija, Davor
    Ling, Andrew C.
    Chiu, Gordon R.
    [J]. FPGA'17: PROCEEDINGS OF THE 2017 ACM/SIGDA INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE GATE ARRAYS, 2017, : 55 - 64
  • [9] Ba LJ, 2014, ADV NEUR IN, V27
  • [10] JHDL - An HDL for reconfigurable systems
    Bellows, P
    Hutchings, B
    [J]. IEEE SYMPOSIUM ON FPGAS FOR CUSTOM COMPUTING MACHINES, PROCEEDINGS, 1998, : 175 - 184