Design and Implementation of Convolutional Neural Networks Accelerator Based on Multidie

Cited by: 5
Authors
Song, Qingzeng [1]
Zhang, Jiabing [1]
Sun, Liankun [1]
Jin, Guanghao [2]
Affiliations
[1] Tiangong Univ, Sch Comp Sci & Technol, Tianjin 300387, Peoples R China
[2] Beijing Polytech, Sch Telecommun Engn, Beijing 100176, Peoples R China
Source
IEEE ACCESS | 2022 / Vol. 10
Funding
National Natural Science Foundation of China
Keywords
Convolutional neural networks; Quantization (signal); Object detection; Field programmable gate arrays; Mathematical models; Hardware acceleration; Hardware accelerator; multi-die; YOLOv4-tiny
DOI
10.1109/ACCESS.2022.3199441
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
To achieve real-time object detection with high throughput and low latency, this paper proposes a multi-die hardware accelerator architecture. It implements three accelerators on the VU9P chip, each bound to an independent super logic region (SLR). To reduce off-chip memory access and power consumption, the design stores the weights and intermediate results in three on-chip buffers, minimizes data access and movement, and maximizes data reuse. Both weights and feature maps are quantized to 8 bits, which doubles the throughput and computational efficiency obtained from a single digital signal processor (DSP). In addition, the accelerator provides many operators, all of them fully parameterized, so the network is easy to extend, and the accelerator is controlled by configuring an instruction group. Running the YOLOv4-tiny algorithm, the accelerator architecture achieves a frame rate of 148.14 frames per second (FPS) and a peak throughput of 2.76 tera operations per second (TOPS) at 200 MHz, with an energy efficiency ratio of 93.15 GOPS/W. The code can be found at https://github.com/19801201/Verilog_CNN_Accelerator.
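The abstract does not say how 8-bit quantization doubles the work done per DSP. One common technique on FPGA DSP slices with a 27x18-bit multiplier (such as the DSP48E2 in UltraScale+ devices like the VU9P) is to pack two signed 8-bit operands into the wide input so that a single multiplication yields two products sharing one operand. The Python sketch below models that arithmetic with plain integers. It is an illustration under that assumption, not the authors' implementation, and the function name `packed_multiply` and the 18-bit field offset are hypothetical choices made for this example.

```python
# Assumption: two int8 values a and b share the multiplicand c, as when two
# weights are applied to the same feature-map value. This models the integer
# arithmetic only; it is not taken from the paper or its repository.

FIELD_BITS = 18  # hypothetical offset between the two packed operands


def packed_multiply(a: int, b: int, c: int) -> tuple[int, int]:
    """Return (a * c, b * c) using a single wide multiplication."""
    for value in (a, b, c):
        if not -128 <= value <= 127:
            raise ValueError("operands must be signed 8-bit integers")

    # Pack a into the upper field and b into the lower field.
    # The result fits in a signed 27-bit word.
    packed = (a << FIELD_BITS) + b

    # One multiplication computes a*c*2^18 + b*c.
    product = packed * c

    # Recover b*c as the sign-extended low 18 bits.
    low = product & ((1 << FIELD_BITS) - 1)
    if low >= 1 << (FIELD_BITS - 1):
        low -= 1 << FIELD_BITS

    # Removing the low field before shifting corrects the borrow that a
    # negative b*c would otherwise leave in the upper field.
    high = (product - low) >> FIELD_BITS
    return high, low


if __name__ == "__main__":
    print(packed_multiply(3, -5, 7))
```

Called as `packed_multiply(3, -5, 7)`, the function returns the pair of products `3 * 7` and `-5 * 7` from one multiplication. In hardware the same correction step is typically a small adder after the DSP output, which is why the packing costs little extra logic.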
Pages: 91497-91508
Page count: 12