Review of neural network model acceleration techniques based on FPGA platforms

Cited by: 4
Authors
Liu, Fang [1 ,2 ]
Li, Heyuan [3 ]
Hu, Wei [3 ]
He, Yanxiang [1 ]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Wuhan, Peoples R China
[2] Wuhan Vocat Coll Software & Engn, Wuhan, Peoples R China
[3] Wuhan Univ Sci & Technol, Coll Comp Sci, Wuhan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Neural network model; FPGA; Algorithm hardware collaboration; Acceleration and optimization; DESIGN-SPACE EXPLORATION; POWER ESTIMATION; HIGH-PERFORMANCE; ARCHITECTURE; HARDWARE; IMPLEMENTATION; SEARCH; CNN;
DOI
10.1016/j.neucom.2024.128511
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Neural network models, celebrated for their outstanding scalability and computational capabilities, have demonstrated remarkable performance across various fields such as vision, language, and multimodality. The rapid advancement of neural networks, fueled by the deep development of Internet technology and the growing demand for intelligent edge devices, introduces new challenges, including large model parameter sizes and increased storage pressure. In this context, Field-Programmable Gate Arrays (FPGAs) have emerged as a preferred platform for accelerating neural network models, thanks to their high performance, energy efficiency, flexibility, and scalability. Building FPGA-based neural network systems requires bridging significant differences in objectives, methods, and design spaces between model design and hardware design. This review adopts a comprehensive analytical framework to explore multidimensional implementation strategies, encompassing optimizations at the algorithmic and hardware levels as well as compiler optimization techniques. It focuses on methods for collaborative optimization between algorithms and hardware, identifies challenges in the co-design process, and proposes corresponding implementation strategies and key steps. Across these technological dimensions, the article provides in-depth technical analysis and discussion, aiming to offer valuable insights for research on optimizing and accelerating neural network models in edge computing environments.
Pages: 36
References (185 total)
  • [1] APNAS: Accuracy-and-Performance-Aware Neural Architecture Search for Neural Hardware Accelerators
    Achararit, Paniti
    Hanif, Muhammad Abdullah
    Putra, Rachmad Vidya Wicaksana
    Shafique, Muhammad
    Hara-Azumi, Yuko
    [J]. IEEE ACCESS, 2020, 8 : 165319 - 165334
  • [2] NullHop: A Flexible Convolutional Neural Network Accelerator Based on Sparse Representations of Feature Maps
    Aimar, Alessandro
    Mostafa, Hesham
    Calabrese, Enrico
    Rios-Navarro, Antonio
    Tapiador-Morales, Ricardo
    Lungu, Iulia-Alexandra
    Milde, Moritz B.
    Corradi, Federico
    Linares-Barranco, Alejandro
    Liu, Shih-Chii
    Delbruck, Tobi
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (03) : 644 - 656
  • [3] SnaPEA: Predictive Early Activation for Reducing Computation in Deep Convolutional Neural Networks
    Akhlaghi, Vahideh
    Yazdanbakhsh, Amir
    Samadi, Kambiz
    Gupta, Rajesh K.
    Esmaeilzadeh, Hadi
    [J]. 2018 ACM/IEEE 45TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), 2018, : 662 - 673
  • [4] Data Reorganization in Memory Using 3D-stacked DRAM
    Akin, Berkin
    Franchetti, Franz
    Hoe, James C.
    [J]. 2015 ACM/IEEE 42ND ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), 2015, : 131 - 143
  • [5] Bit-Pragmatic Deep Neural Network Computing
    Albericio, Jorge
    Delmas, Alberto
    Judd, Patrick
    Sharify, Sayeh
    O'Leary, Gerard
    Genov, Roman
    Moshovos, Andreas
    [J]. 50TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO), 2017, : 382 - 394
  • [6] Alwani M, 2016, INT SYMP MICROARCH
  • [7] Optimization of Convolutional Neural Networks on Resource Constrained Devices
    Arish, S.
    Sinha, Sharad
    Smitha, K. G.
    [J]. 2019 IEEE COMPUTER SOCIETY ANNUAL SYMPOSIUM ON VLSI (ISVLSI 2019), 2019, : 19 - 24
  • [8] An OpenCL™ Deep Learning Accelerator on Arria 10
    Aydonat, Utku
    O'Connell, Shane
    Capalija, Davor
    Ling, Andrew C.
    Chiu, Gordon R.
    [J]. FPGA'17: PROCEEDINGS OF THE 2017 ACM/SIGDA INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE GATE ARRAYS, 2017, : 55 - 64
  • [9] Ba LJ, 2014, ADV NEUR IN, V27
  • [10] JHDL - An HDL for reconfigurable systems
    Bellows, P
    Hutchings, B
    [J]. IEEE SYMPOSIUM ON FPGAS FOR CUSTOM COMPUTING MACHINES, PROCEEDINGS, 1998, : 175 - 184