Design of FPGA-Based Accelerator for Convolutional Neural Network under Heterogeneous Computing Framework with OpenCL

Cited by: 9

Authors:

Luo, Li [1]

Wu, Yakun [1]

Qiao, Fei [2]

Yang, Yi [2]

Wei, Qi [2]

Zhou, Xiaobo [1]

Fan, Yongkai [3]

Xu, Shuzheng [2]

Liu, Xinjun [4]

Yang, Huazhong [2]

Affiliations:

[1] Beijing Jiaotong Univ, Dept Elect Sci & Technol, Beijing, Peoples R China

[2] Tsinghua Univ, Dept Elect Engn, Beijing, Peoples R China

[3] China Univ Petr, Beijing, Peoples R China

[4] Tsinghua Univ, Dept Mech Engn, Beijing, Peoples R China

Source:

INTERNATIONAL JOURNAL OF RECONFIGURABLE COMPUTING | 2018 / Volume 2018

Funding:

National Natural Science Foundation of China; National Key Research and Development Program of China;

Keywords:

Compendex;

DOI:

10.1155/2018/1785892

Chinese Library Classification number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

CPUs lack the resources to compute convolutional neural networks (CNNs) efficiently, especially in embedded applications. Heterogeneous computing platforms such as GPUs, FPGAs, and ASICs are therefore widely used to accelerate CNN tasks. Among these, an FPGA can accelerate computation by mapping the algorithm onto parallel hardware, whereas a CPU cannot fully exploit the parallelism. By fully using the parallelism in the neural network's structure, an FPGA can reduce computing cost and increase computing speed. However, FPGA development requires considerable design skill. As a heterogeneous development platform, OpenCL offers a high abstraction level, a short development cycle, and strong portability, which can compensate for the shortage of skilled designers. This paper uses Xilinx SDAccel to realize parallel acceleration of a CNN task and proposes an optimization strategy for a single convolutional layer to accelerate the CNN. Simulation results show that the proposed strategy could improve calculation speed. Compared with the baseline design, the single-convolutional-layer strategy could increase computing speed by 14 times. Performance of the whole CNN task could be improved 2 times compared with before, and image classification could reach more than 48 fps.


Pages: 10

