An FPGA-Based Approach for Compressing and Accelerating Depthwise Separable Convolution

Cited by: 0

Authors:

Yang, Ruiheng [1]

Chen, Zhikun [1]

Hu, Lingtong [1]

Cui, Xihang [1]

Guo, Yunfei [1]

Affiliations:

[1] Hangzhou Dianzi Univ, Sch Automation, Sch Artificial Intelligence, Hangzhou 310018, Peoples R China

Source:

IEEE SIGNAL PROCESSING LETTERS | 2024 / Vol. 31

Funding:

National Natural Science Foundation of China;

Keywords:

Convolution; Optimization; Throughput; Resource management; Quantization (signal); Parallel processing; Hardware acceleration; CLIP-Q; DSC; FPGA; hardware accelerator; CNN;

DOI:

10.1109/LSP.2024.3425286

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Subject classification codes:

0808 ; 0809 ;

Abstract:

The rapid progress of deep learning has increased the parameter count and computational requirements of convolutional neural networks (CNNs), making it difficult to deploy networks on resource-constrained hardware platforms. Depthwise separable convolution (DSC) is one method used to tackle this issue, but it still retains numerous redundant parameters. Meanwhile, the compression learning by in-parallel pruning-quantization (CLIP-Q) method is an efficient approach to network compression; however, it includes no optimization specific to DSC. This study proposes a method named DSC-CLIP-Q, which is derived from CLIP-Q and designed to address the parameter distribution characteristics of DSC. Furthermore, a highly energy-efficient, reconfigurable hardware accelerator is developed for this approach. Additional storage optimizations tailored to the hardware features of DSC-CLIP-Q are introduced, together with a reconfigurable processing element (PE) array designed for the convolutional characteristics of DSC. The experimental results indicate that the proposed DSC accelerator attains high throughput and energy efficiency while also improving network accuracy.
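
As background for the abstract's claim that DSC reduces network size, the following is a minimal sketch of the standard weight-count arithmetic for a depthwise separable layer compared with an ordinary convolution. It is not taken from the paper: the function names and the example layer shape (256 input channels, 256 output channels, 3 x 3 kernel) are illustrative assumptions, and biases are ignored.

```python
# Illustrative sketch only: function names and the layer shape are assumptions,
# not values or code from the paper.

def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def dsc_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise separable convolution (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def pointwise_share(c_in: int, c_out: int, k: int) -> float:
    """Fraction of the DSC weights that sit in the pointwise (1 x 1) stage."""
    return (c_in * c_out) / dsc_params(c_in, c_out, k)


if __name__ == "__main__":
    c_in, c_out, k = 256, 256, 3  # hypothetical layer shape
    std = standard_conv_params(c_in, c_out, k)
    dsc = dsc_params(c_in, c_out, k)
    print(f"standard conv weights: {std}")
    print(f"DSC weights:           {dsc}")
    print(f"reduction factor:      {std / dsc:.2f}")
    print(f"pointwise share:       {pointwise_share(c_in, c_out, k):.3f}")
```

For this hypothetical shape, almost all of the DSC weights sit in the pointwise stage. That uneven split is one general property of DSC layers; whether it is the "parameter distribution characteristic" the authors target is not stated in the abstract.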

Pages: 2590 - 2594

Page count: 5
