A Block-Floating-Point Arithmetic Based FPGA Accelerator for Convolutional Neural Networks

Citations: 0
Authors
Zhang, Heshan [1]
Liu, Zhenyu [2]
Zhang, Guanwen [1]
Dai, Jiwu [1]
Lian, Xiaocong [3]
Zhou, Wei [1]
Ji, Xiangyang [3]
Affiliations
[1] Northwestern Polytech Univ, Sch Elect & Informat, Xian, Peoples R China
[2] Tsinghua Univ, RIIT&TNList, Beijing, Peoples R China
[3] Tsinghua Univ, Dept Automat, Beijing, Peoples R China
Source
2019 7TH IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (IEEE GLOBALSIP) | 2019
Funding
National Natural Science Foundation of China;
Keywords
CNN; FPGA; block-floating-point;
DOI
10.1109/globalsip45357.2019.8969292
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Convolutional neural networks (CNNs) have been widely used in computer vision applications and have achieved great success. However, large-scale CNN models usually consume substantial computing and memory resources, which makes them difficult to deploy on embedded devices. An efficient block-floating-point (BFP) arithmetic is proposed in this paper. Compared with 32-bit floating-point arithmetic, the memory and off-chip bandwidth requirements during convolution are reduced by 50% and 72.37%, respectively. Because BFP arithmetic is adopted, the complex multiplication and addition operations on floating-point numbers can be replaced by the corresponding operations on fixed-point numbers, which are more efficient in hardware. A CNN model can be deployed on our accelerator with no more than 0.14% top-1 accuracy loss, and there is no need for retraining or fine-tuning. By employing a series of ping-pong memory access schemes, 2-dimensional propagate partial multiply-accumulate (PPMAC) processors, and an optimized memory system, we implemented a CNN accelerator on a Xilinx VC709 evaluation board. The accelerator achieves a performance of 665.54 GOP/s and a power efficiency of 89.7 GOP/s/W at a 300 MHz working frequency, which significantly outperforms previous FPGA-based accelerators.
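The idea that BFP lets fixed-point operations stand in for floating-point ones can be illustrated with a minimal Python sketch. It shows generic block-floating-point quantization, not necessarily the scheme used in the paper: the function names `bfp_quantize` and `bfp_dot`, the 8-bit mantissa width, and the rounding and clamping choices are all assumptions made for illustration.

```python
import math

def bfp_quantize(block, mantissa_bits=8):
    """Quantize floats to signed integer mantissas that share one exponent.

    mantissa_bits (sign included) is a hypothetical default, not a value
    taken from the paper.
    """
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return [0] * len(block), 0
    # frexp returns (m, e) with max_abs = m * 2**e and 0.5 <= m < 1.
    _, e = math.frexp(max_abs)
    # Pick the shared exponent so the largest value uses all magnitude bits.
    shared_exp = e - (mantissa_bits - 1)
    limit = (1 << (mantissa_bits - 1)) - 1
    mantissas = []
    for x in block:
        q = round(math.ldexp(x, -shared_exp))      # x / 2**shared_exp
        mantissas.append(max(-limit, min(limit, q)))  # clamp to signed range
    return mantissas, shared_exp

def bfp_dot(a_mant, a_exp, b_mant, b_exp):
    """Dot product of two BFP blocks: integer MACs, one final rescale."""
    acc = sum(x * y for x, y in zip(a_mant, b_mant))  # fixed-point only
    return math.ldexp(acc, a_exp + b_exp)             # acc * 2**(a_exp + b_exp)

weights = [0.5, -0.25, 0.125, 0.75]
activations = [1.0, 2.0, -3.0, 1.5]
w_mant, w_exp = bfp_quantize(weights)
a_mant, a_exp = bfp_quantize(activations)
result = bfp_dot(w_mant, w_exp, a_mant, a_exp)
exact = sum(w * a for w, a in zip(weights, activations))
print(w_mant, w_exp, a_mant, a_exp, result, exact)
```

In this sketch the inner loop of `bfp_dot` touches only integers; the two shared exponents are combined once per block pair, which is the property that makes the multiply-accumulate path cheap in hardware. The example values are chosen so the quantization is exact; arbitrary inputs would incur a rounding error bounded by the mantissa width.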
Pages: 5
Related papers
50 in total
  • [41] High Speed, Approximate Arithmetic Based Convolutional Neural Network Accelerator
    Elbtity, Mohammed E.
    Son, Hyun-Wook
    Lee, Dong-Yeong
    Kim, HyungWon
    2020 17TH INTERNATIONAL SOC DESIGN CONFERENCE (ISOCC 2020), 2020, : 71 - 72
  • [42] Real-Time Fixed-Point Hardware Accelerator of Convolutional Neural Network on FPGA Based
    Ozkilbac, Bahadir
    Ozbek, Ibrahim Yucel
    Karacali, Tevhit
    5TH INTERNATIONAL CONFERENCE ON COMPUTING AND INFORMATICS (ICCI 2022), 2022, : 1 - 5
  • [43] A GPU-Outperforming FPGA Accelerator Architecture for Binary Convolutional Neural Networks
    Li, Yixing
    Liu, Zichuan
    Xu, Kai
    Yu, Hao
    Ren, Fengbo
    ACM JOURNAL ON EMERGING TECHNOLOGIES IN COMPUTING SYSTEMS, 2018, 14 (02)
  • [44] An Efficient Convolutional Neural Network Accelerator on FPGA
    Si, Junye
    Jiang, Jianfei
    Wang, Qin
    Huang, Jia
    2018 14TH IEEE INTERNATIONAL CONFERENCE ON SOLID-STATE AND INTEGRATED CIRCUIT TECHNOLOGY (ICSICT), 2018, : 1392 - 1394
  • [45] Accelerator Design with Effective Resource Utilization for Binary Convolutional Neural Networks on an FPGA
    Kim, Sunwoong
    Rutenbar, Rob A.
    PROCEEDINGS 26TH IEEE ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM 2018), 2018, : 218 - 218
  • [46] An Efficient FIFO Based Accelerator for Convolutional Neural Networks
    Panchbhaiyye, Vineet
    Ogunfunmi, Tokunbo
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2021, 93 (10): : 1117 - 1129
  • [48] FPGA accelerator for floating-point matrix multiplication
    Jovanovic, Z.
    Milutinovic, V.
    IET COMPUTERS AND DIGITAL TECHNIQUES, 2012, 6 (04): : 249 - 256
  • [49] Calculation Optimization for Convolutional Neural Networks and FPGA-based Accelerator Design Using the Parameters Sparsity
    Liu Qinrang
    Liu Chongyang
    JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2018, 40 (06) : 1368 - 1374
  • [50] A High-efficiency FPGA-based Accelerator for Convolutional Neural Networks using Winograd Algorithm
    Huang, Y.
    Shen, J.
    Wang, Z.
    Wen, M.
    Zhang, C.
    2018 INTERNATIONAL CONFERENCE ON ELECTRONICS, COMMUNICATIONS AND CONTROL ENGINEERING (ICECC), 2018, 1026