An efficient lightweight CNN acceleration architecture for edge computing based-on FPGA

Cited by: 4

Authors:

Wu, Ruidong [1]

Liu, Bing [1]

Fu, Ping [1]

Chen, Haolin [1]

Affiliations:

[1] Harbin Inst Technol, Sch Elect & Informat Engn, Harbin 150001, Peoples R China

Source:

APPLIED INTELLIGENCE | 2023 / Vol. 53 / Issue 11

Funding:

National Natural Science Foundation of China;

Keywords:

FPGA; CNN; Acceleration architecture; Efficient inference; CONVOLUTIONAL NEURAL-NETWORK; DEEP;

DOI:

10.1007/s10489-022-04251-3

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Edge computing imposes tight constraints on system performance, volume, and power, so a single-chip solution based on a Field Programmable Gate Array (FPGA), with its parallel execution, flexible configuration, and power efficiency, is attractive for accelerating Convolutional Neural Networks (CNNs). However, implementing a lightweight CNN with limited on-chip resources while maintaining high computing efficiency and utilization remains challenging. To achieve efficient acceleration on a single chip, we implement a Network-on-Chip (NoC) based on a Processing Element (PE) that consists of multiple node arrays. Moreover, the computing and memory efficiency of the PE is optimized with a sharing function and hybrid memory. To maximize resource utilization, a theoretical model is constructed to explore the parallel parameters and running cycles of each PE. In experiments with LeNet and MobileNet, resource utilization reaches 83.61% and 95.28%, with throughputs of 53.3 Giga Operations Per Second (GOPS) and 41.9 GOPS, respectively. Power measurements show a power efficiency of 77.25 GOPS/W and 85.51 GOPS/W on our platform, which is sufficient for efficient inference in edge computing.

Citation

Pages: 13867-13881

Page count: 15

