Deploying and Optimizing Convolutional Neural Networks on Heterogeneous Architecture

Cited by: 0

Authors:

Jiang, Junning [1]

Cai, Liang [1]

Dong, Feng [2]

Yu, Kehua [2]

Chen, Ke [2]

Qu, Wei [2]

Jiang, Jianfei [1]

Affiliations:

[1] Shanghai Jiao Tong Univ, Sch Microelect, Shanghai 200240, Peoples R China

[2] Beijing iQIYI Sci & Technol Co Ltd, Shanghai, Peoples R China

Source:

2019 IEEE 13TH INTERNATIONAL CONFERENCE ON ASIC (ASICON) | 2019

Keywords:

CNN; acceleration; FPGA; optimization; data flow; compute precision

DOI:

10.1109/asicon47005.2019.8983456

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]

Discipline classification codes:

0808; 0809

Abstract:

Deploying convolutional neural networks (CNNs) on hardware platforms can accelerate inference and is critical for the application of artificial intelligence. In this paper, we design an FPGA+CPU heterogeneous platform to accelerate CNNs. We propose dataflow optimization, accelerator structure optimization, and compute precision optimization to improve the performance of the acceleration platform. Various ResNet and MobileNet networks are successfully deployed on the platform. Applying the proposed dataflow and precision optimizations yields a 3.25x inference performance improvement on ResNet. Applying the accelerator structure and precision optimizations yields a 3.63x inference performance improvement on MobileNet.

Pages: 4