An Adaptive Row-based Weight Reuse Scheme for FPGA Implementation of Convolutional Neural Networks

Cited by: 0
Authors
Je, Hyeonseung [1]
Duy Thanh Nguyen [1]
Lee, Kyujoong [2]
Lee, Hyuk-Jae [1]
Affiliations
[1] Seoul Natl Univ, Dept Elect & Comp Engn, Seoul, South Korea
[2] Sunmoon Univ, Dept Elect Engn, Asan, South Korea
Source
2021 36TH INTERNATIONAL TECHNICAL CONFERENCE ON CIRCUITS/SYSTEMS, COMPUTERS AND COMMUNICATIONS (ITC-CSCC) | 2021
Keywords
FPGA; Convolutional neural networks; U-Net; Row-reuse scheme; Adaptive;
DOI
10.1109/ITC-CSCC52171.2021.9501490
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
There is an increasing need to implement convolutional neural networks (CNNs) on FPGAs, because an FPGA offers more design flexibility than an ASIC and lower power consumption than a GPU. Deploying a CNN successfully requires considering both the size of the network and the resources of the target FPGA board. However, previous works use a fixed dataflow that is not optimized for each layer. As a result, they require high on-chip buffer utilization and frequent memory accesses. The row-based weight reuse scheme is efficient in reducing the input/output buffer size, but it causes resource underutilization for layers with small feature maps. This paper proposes an adaptive row reuse scheme that applies a different level of row reuse to each layer depending on the layer's characteristics. The proposed design is implemented on a Xilinx KCU1500 board, and the accelerator achieves a throughput of 994.74 GOPS for U-Net. For general CNN implementation, the proposed scheme achieves 1080.9 GOPS when running VGG16 with a buffer 1.7 times smaller than in previous works.
Pages: 4
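
The idea in the abstract of choosing a row-reuse level per layer can be illustrated with a toy model. The sketch below is not the authors' implementation: the `ConvLayer` fields, the buffer cost formula (stride 1, "same" padding, one full-width row per buffered input row), the candidate levels, and the word budget are all assumptions made for illustration. It simply picks, for each layer, the largest reuse level whose input buffer fits a given budget, which in this model also minimizes how often the weights are reloaded.

```python
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class ConvLayer:
    """Hypothetical layer description (not taken from the paper)."""
    name: str
    height: int       # feature-map height in rows
    width: int        # feature-map width in columns
    in_channels: int  # number of input channels
    kernel: int       # square kernel size; stride 1 and "same" padding assumed


def input_buffer_words(layer: ConvLayer, rows: int) -> int:
    """Words of input buffer needed to produce `rows` output rows at once.

    Assumed model: `rows` output rows need `rows + kernel - 1` input rows,
    each row holding `width * in_channels` words.
    """
    return (rows + layer.kernel - 1) * layer.width * layer.in_channels


def weight_loads(layer: ConvLayer, rows: int) -> int:
    """Number of times the layer's weights are fetched when each fetch
    is reused across a group of `rows` output rows."""
    return ceil(layer.height / rows)


def choose_row_reuse(layer: ConvLayer, buffer_budget_words: int,
                     candidates=(1, 2, 4, 8)) -> int:
    """Pick the largest candidate reuse level that fits the buffer budget.

    Falls back to 1 (no row reuse) when no candidate fits.
    """
    feasible = [
        r for r in candidates
        if r <= layer.height
        and input_buffer_words(layer, r) <= buffer_budget_words
    ]
    return max(feasible) if feasible else 1


if __name__ == "__main__":
    budget = 200_000  # assumed on-chip input buffer budget, in words
    layers = [
        ConvLayer("large_map", height=512, width=512, in_channels=64, kernel=3),
        ConvLayer("small_map", height=32, width=32, in_channels=512, kernel=3),
    ]
    for layer in layers:
        r = choose_row_reuse(layer, budget)
        print(layer.name, "reuse level:", r,
              "weight loads:", weight_loads(layer, r))
```

In this model a fixed reuse level would either overflow the buffer on wide layers or leave it mostly idle on narrow ones, which is the kind of mismatch an adaptive per-layer choice is meant to avoid. A real accelerator would also have to account for output buffering, processing-element utilization, and DRAM bandwidth, none of which are modeled here.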