An FPGA-Based Approach for Compressing and Accelerating Depthwise Separable Convolution

Cited by: 0

Authors:

Yang, Ruiheng [1]

Chen, Zhikun [1]

Hu, Lingtong [1]

Cui, Xihang [1]

Guo, Yunfei [1]

Affiliations:

[1] Hangzhou Dianzi Univ, Sch Automation, Sch Artificial Intelligence, Hangzhou 310018, Peoples R China

Source:

IEEE SIGNAL PROCESSING LETTERS | 2024 / Vol. 31

Funding:

National Natural Science Foundation of China;

Keywords:

Convolution; Optimization; Throughput; Resource management; Quantization (signal); Parallel processing; Hardware acceleration; CLIP-Q; DSC; FPGA; hardware accelerator; CNN;

DOI:

10.1109/LSP.2024.3425286

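Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code: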

0808 ; 0809 ;

Abstract:

The rapid progress of deep learning has increased the parameter count and computational requirements of convolutional neural networks (CNNs), which makes it difficult to deploy networks on hardware platforms with constrained resources. Depthwise separable convolution (DSC) is one method used to tackle this issue, but it still retains numerous redundant parameters. Meanwhile, the compression learning by in-parallel pruning-quantization (CLIP-Q) method is an efficient approach to network compression, but it includes no optimization specific to DSC. This study proposes a method named DSC-CLIP-Q, which is derived from CLIP-Q and designed to address the parameter distribution characteristics of DSC. Furthermore, a highly energy-efficient and reconfigurable hardware accelerator is developed for this method. Additional storage optimizations tailored to the hardware features of DSC-CLIP-Q are introduced, in conjunction with a reconfigurable processing element (PE) array designed for the convolutional characteristics of DSC. The experimental results indicate that the proposed DSC accelerator attains high throughput and energy efficiency while also improving network accuracy.
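As background for the abstract's remark about DSC parameters, the following minimal sketch counts the weights in a standard convolution and in its depthwise separable replacement. The layer sizes (128 input channels, 256 output channels, 3 x 3 kernel) and the function names are hypothetical and chosen only for illustration; they are not taken from the paper. Biases are ignored. The uneven split between depthwise and pointwise weights shown here is a general property of DSC and may or may not be the distribution characteristic that DSC-CLIP-Q targets.

```python
def standard_conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def dsc_params(c_in, c_out, k):
    """Weight counts of a depthwise separable convolution (bias ignored).

    Returns (depthwise, pointwise):
      depthwise: one k x k filter per input channel
      pointwise: a 1 x 1 convolution that mixes channels
    """
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise, pointwise


# Hypothetical layer sizes, for illustration only.
c_in, c_out, k = 128, 256, 3

std = standard_conv_params(c_in, c_out, k)
dw, pw = dsc_params(c_in, c_out, k)

# The DSC-to-standard ratio equals 1/c_out + 1/k^2.
ratio = (dw + pw) / std
pointwise_share = pw / (dw + pw)

print(f"standard conv weights : {std}")
print(f"DSC depthwise weights : {dw}")
print(f"DSC pointwise weights : {pw}")
print(f"DSC / standard ratio  : {ratio:.4f}")
print(f"pointwise share of DSC: {pointwise_share:.4f}")
```

For these sizes the pointwise layer holds most of the DSC weights, so a pruning or quantization scheme that treats the two layer types differently has an obvious place to focus.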

Pages: 2590 - 2594

Page count: 5
