An FPGA-Based Energy-Efficient Reconfigurable Convolutional Neural Network Accelerator for Object Recognition Applications

Cited by: 50

Authors:

Li, Jixuan [1]

Un, Ka-Fai [1]

Yu, Wei-Han [1]

Mak, Pui-In [1]

Martins, Rui P. [1]

Affiliations:

[1] Univ Macau, Fac Sci & Technol, State Key Lab Analog & Mixed Signal VLSI IME & DE, Macau, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS | 2021 / Vol. 68 / Issue 09

Keywords:

Frequency modulation; Kernel; Throughput; Parallel processing; Memory management; Field programmable gate arrays; Computational efficiency; Computation efficiency; convolutional neural network (CNN); FPGA; object recognition; reconfigurability

DOI:

10.1109/TCSII.2021.3095283

Chinese Library Classification (CLC) codes:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]

Discipline classification codes:

0808; 0809

Abstract:

The computational efficiency is the prime concern of a computation-intensive deep convolutional neural network (CNN). In this Brief, we report an FPGA-based computation-efficient reconfigurable CNN accelerator. It innovates in the utilization of a kernel partition technique to substantially reduce the repeated access to the input feature maps and the kernels. As a result, it balances the ability for parallel computing while consuming less system power. Experimental results prove that the proposed CNN accelerator achieves a peak throughput of 220.0 GOP/s with an energy efficiency of 22.9 GOPs/W at 151.4 frames/s for the AlexNet. It is also reconfigurable to process VGG-16 befitting complex object recognition.

Pages: 3143-3147

Page count: 5
