Optimizing FPGA-Based Convolutional Neural Network Performance

Cited by: 3
Authors
Kao, Chi-Chou [1]
Affiliations
[1] Natl Univ Tainan, Dept Comp Sci & Informat Engn, Tainan 700, Taiwan
Keywords
CNN; FPGA; optimize; performance; architecture
DOI
10.1142/S0218126623502547
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In deep learning, convolutional neural networks (CNNs) are a class of artificial neural networks (ANNs) most commonly applied to the analysis of visual imagery. They are also known as Shift-Invariant or Space-Invariant Artificial Neural Networks (SIANNs), because of the shared-weight architecture of the convolution kernels, or filters, which slide along input features and produce translation-equivariant responses known as feature maps. Recently, various FPGA-based CNN architectures have been proposed, because the FPGA platform offers high performance and a fast development cycle. However, several key issues remain open: optimizing the performance of CNN layers with different structures, designing high-performance heterogeneous accelerators, and reducing the overhead of integrating with neural network frameworks. To address these problems, we propose dynamic cycle pipeline tiling, data layout optimization, and a flexible, pipelined software and hardware (SW-HW)-integrated architecture. The proposed architecture has been implemented on an FPGA board and tested with several benchmarks. The proposed dynamic tiling and data layout transformation improve performance by 2.3 times. Moreover, with two-level pipelining, we achieve up to a fivefold speedup, and the proposed system is 3.8 times more energy-efficient than the GPU.
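The abstract does not describe how the dynamic cycle pipeline tiling works, so the following is only a generic illustration of the underlying idea of tiling a convolution, not the paper's method. It is a minimal pure-Python sketch assuming a single-channel input, a square kernel, stride 1, and no padding; the function names and the tile-size parameters tile_h and tile_w are hypothetical. The point it shows is that each output tile depends on a small input window, which is the block an accelerator could hold in on-chip memory.

```python
# Illustrative sketch only: generic output tiling for a single-channel 2D
# convolution (square kernel, stride 1, no padding). Names and tile sizes
# are hypothetical and do not come from the paper.

def conv2d_direct(image, kernel):
    """Reference convolution (cross-correlation, as is conventional in CNNs)."""
    H, W = len(image), len(image[0])
    K = len(kernel)
    out_h, out_w = H - K + 1, W - K + 1
    out = [[0] * out_w for _ in range(out_h)]
    for r in range(out_h):
        for c in range(out_w):
            acc = 0
            for i in range(K):
                for j in range(K):
                    acc += image[r + i][c + j] * kernel[i][j]
            out[r][c] = acc
    return out


def conv2d_tiled(image, kernel, tile_h, tile_w):
    """Same result as conv2d_direct, computed one output tile at a time.

    A tile of tile_h x tile_w outputs needs only a
    (tile_h + K - 1) x (tile_w + K - 1) input window.
    """
    H, W = len(image), len(image[0])
    K = len(kernel)
    out_h, out_w = H - K + 1, W - K + 1
    out = [[0] * out_w for _ in range(out_h)]
    for r0 in range(0, out_h, tile_h):
        for c0 in range(0, out_w, tile_w):
            # Clip the tile at the output boundary.
            r1 = min(r0 + tile_h, out_h)
            c1 = min(c0 + tile_w, out_w)
            # Copy the input window for this tile (stands in for an on-chip buffer).
            window = [row[c0:c1 + K - 1] for row in image[r0:r1 + K - 1]]
            for r in range(r1 - r0):
                for c in range(c1 - c0):
                    acc = 0
                    for i in range(K):
                        for j in range(K):
                            acc += window[r + i][c + j] * kernel[i][j]
                    out[r0 + r][c0 + c] = acc
    return out


if __name__ == "__main__":
    image = [[r * 5 + c for c in range(5)] for r in range(5)]
    kernel = [[1] * 3 for _ in range(3)]
    # The tiled and direct versions should agree for any positive tile size.
    print(conv2d_tiled(image, kernel, 2, 2) == conv2d_direct(image, kernel))
```

In this sketch the tile sizes are fixed per call. A scheme that chooses them per layer would call conv2d_tiled with different tile_h and tile_w values depending on the layer shape, but how the paper makes that choice is not stated in the abstract.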
Pages: 19