An Efficient FPGA-based Depthwise Separable Convolutional Neural Network Accelerator with Hardware Pruning

Cited by: 4
Authors
Liu, Zhengyan [1]
Liu, Qiang [1]
Yan, Shun [1]
Cheung, Ray C. C. [2]
Affiliations
[1] Tianjin Univ, Sch Microelect, 92nd Rd, Tianjin 300072, Nankai, Peoples R China
[2] City Univ Hong Kong, Dept Elect Engn, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
CNN accelerator; depthwise-separable convolution; bottleneck; model compression
DOI
10.1145/3615661
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Convolutional neural networks (CNNs) have been widely deployed in computer vision tasks. However, the computation- and resource-intensive nature of CNNs hinders their deployment on embedded systems. This article proposes an efficient inference accelerator on a Field Programmable Gate Array (FPGA) for CNNs with depthwise separable convolutions. To improve the accelerator's efficiency, we make four contributions: (1) an efficient convolution engine, with multiple strategies for exploiting parallelism and a configurable adder tree, is designed to support three types of convolution operations; (2) a dedicated architecture combined with input buffers is designed for the bottleneck network structure to reduce data transmission time; (3) a hardware padding scheme is proposed to eliminate invalid padding operations; and (4) a hardware-assisted pruning method is developed to support an online tradeoff between model accuracy and power consumption. Experimental results show that, for MobileNetV2, the accelerator achieves 10x and 6x improvements in energy efficiency over the CPU and GPU implementations, respectively, and delivers 302.3 frames per second and 181.8 GOPS, the best performance among several existing single-engine accelerators on FPGAs. The proposed hardware-assisted pruning method reduces power consumption by 59.7% with an accuracy loss within 5%.
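As a rough illustration of why depthwise separable convolutions suit resource-limited hardware, the sketch below compares multiply-accumulate (MAC) counts for a standard convolution and its depthwise separable counterpart. The layer shape is a hypothetical example and is not taken from the article; the function names are likewise illustrative. The last lines derive the per-frame operation count implied by the throughput figures quoted in the abstract.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """MACs for a standard k x k convolution producing an h x w output map."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k):
    """MACs for a k x k depthwise convolution followed by a 1 x 1 pointwise one."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


# Hypothetical layer shape, chosen only for illustration.
h, w, c_in, c_out, k = 56, 56, 64, 128, 3

std = standard_conv_macs(h, w, c_in, c_out, k)
dws = depthwise_separable_macs(h, w, c_in, c_out, k)
ratio = dws / std  # equals 1/c_out + 1/k**2 for any layer shape

print(f"standard conv MACs:       {std}")
print(f"depthwise separable MACs: {dws}")
print(f"cost ratio:               {ratio:.4f}")

# Figures quoted in the abstract: 181.8 GOPS at 302.3 frames per second.
# Dividing one by the other gives the giga-operations implied per frame.
gop_per_frame = 181.8 / 302.3
print(f"implied GOP per frame:    {gop_per_frame:.3f}")
```

For a 3 x 3 kernel the cost ratio is dominated by the 1/9 term, so the separable form needs roughly one ninth of the arithmetic of a standard convolution once the output channel count is large.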
Pages: 20