An Efficient FPGA-based Depthwise Separable Convolutional Neural Network Accelerator with Hardware Pruning

Cited by: 4

Authors:

Liu, Zhengyan [1]

Liu, Qiang [1]

Yan, Shun [1]

Cheung, Ray C. C. [2]

Affiliations:

[1] Tianjin Univ, Sch Microelect, 92nd Rd, Tianjin 300072, Nankai, Peoples R China

[2] City Univ Hong Kong, Dept Elect Engn, Hong Kong, Peoples R China

Source:

ACM TRANSACTIONS ON RECONFIGURABLE TECHNOLOGY AND SYSTEMS | 2024, Vol. 17, No. 01

Funding:

National Natural Science Foundation of China;

Keywords:

CNN accelerator; depthwise-separable convolution; bottleneck; model compression;

DOI:

10.1145/3615661

Chinese Library Classification:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Convolutional neural networks (CNNs) have been widely deployed in computer vision tasks. However, the computation- and resource-intensive nature of CNNs hinders their application on embedded systems. This article proposes an efficient inference accelerator on a Field Programmable Gate Array (FPGA) for CNNs with depthwise separable convolutions. To improve the accelerator's efficiency, we make four contributions: (1) an efficient convolution engine, with multiple strategies for exploiting parallelism and a configurable adder tree, is designed to support three types of convolution operations; (2) a dedicated architecture combined with input buffers is designed for the bottleneck network structure to reduce data transmission time; (3) a hardware padding scheme is proposed to eliminate invalid padding operations; and (4) a hardware-assisted pruning method is developed to support an online tradeoff between model accuracy and power consumption. Experimental results show that for MobileNetV2 the accelerator achieves 10x and 6x energy-efficiency improvements over the CPU and GPU implementations, respectively, and delivers 302.3 frames per second and 181.8 GOPS, the best performance among several existing single-engine accelerators on FPGAs. The proposed hardware-assisted pruning method can reduce power consumption by 59.7% with an accuracy loss within 5%.

Pages: 20