An Adaptive Row-based Weight Reuse Scheme for FPGA Implementation of Convolutional Neural Networks

Cited by: 0

Authors:

Je, Hyeonseung [1]

Duy Thanh Nguyen [1]

Lee, Kyujoong [2]

Lee, Hyuk-Jae [1]

Affiliations:

[1] Seoul Natl Univ, Dept Elect & Comp Engn, Seoul, South Korea

[2] Sunmoon Univ, Dept Elect Engn, Asan, South Korea

Source:

2021 36TH INTERNATIONAL TECHNICAL CONFERENCE ON CIRCUITS/SYSTEMS, COMPUTERS AND COMMUNICATIONS (ITC-CSCC) | 2021

Keywords:

FPGA; Convolutional neural networks; U-Net; Row-reuse scheme; Adaptive;

DOI:

10.1109/ITC-CSCC52171.2021.9501490

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

There is an increasing need to implement convolutional neural networks (CNNs) on FPGAs, because an FPGA offers greater design flexibility than an ASIC and lower power consumption than a GPU. To deploy a CNN successfully, both the size of the network and the resources of the target FPGA board must be considered. However, previous works use a fixed dataflow that is not optimized for each layer, which leads to high on-chip buffer utilization and frequent memory access. The row-based weight reuse scheme is efficient in reducing the input/output buffer size, but it causes resource underutilization for layers with small feature maps. This paper proposes an adaptive row-reuse scheme that applies a different level of row reuse to each layer depending on its characteristics. The proposed design is implemented on a Xilinx KCU1500 board, and the accelerator achieves a throughput of 994.74 GOPS for U-Net. For general CNN implementation, the proposed scheme achieves 1080.9 GOPS when running VGG16, with a buffer size 1.7 times smaller than that of previous works.

Pages: 4
