V-SKP: Vectorized Kernel-Based Structured Kernel Pruning for Accelerating Deep Convolutional Neural Networks

Cited by: 0
Authors
Koo, Kwanghyun [1 ]
Kim, Hyun [1 ]
Affiliation
[1] Seoul Natl Univ Sci & Technol, Res Ctr Elect & Informat Technol, Dept Elect & Informat Engn, Seoul 01811, South Korea
Source
IEEE ACCESS | 2023 / Vol. 11
Keywords
Kernel pruning; convolutional neural networks; vectorized kernel; network compression;
DOI
10.1109/ACCESS.2023.3326534
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
In recent years, kernel pruning, which combines the advantages of weight pruning and filter pruning, has been actively studied. Although kernel pruning must be implemented as structured pruning to obtain an actual acceleration effect on GPUs, most existing methods are limited in that they are implemented as unstructured pruning. To address this problem, we propose vectorized kernel-based structured kernel pruning (V-SKP), which achieves a large FLOPs reduction with minimal accuracy loss while maintaining a 4D weight structure. V-SKP treats each kernel of a convolution layer as a vector and performs pruning by extracting features from each filter vector of the layer. Whereas conventional L1/L2-norm-based pruning considers only the magnitude of a vector and removes parameters without regard to its direction, V-SKP removes kernels by considering both the features of the filter vector and the magnitude and direction of the vectorized kernel. Moreover, because kernel-pruned weights cannot be used directly with standard convolution, the retained kernel indices are compressed and stored in a kernel index set during the proposed pruning scheme so that the kernel-pruned weights can be matched with the input channels. In addition, a kernel index convolution method is proposed that performs convolution on the GPU by matching input channels with the kernel-pruned weights. Experimental results show that V-SKP achieves significant parameter and FLOPs reductions with acceptable accuracy degradation on various networks, including ResNet-50, and, unlike conventional kernel pruning techniques, delivers real acceleration on GPUs.
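The mechanism the abstract describes can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the exact scoring formula (here, L2 magnitude weighted by cosine alignment with the filter's mean kernel), the keep ratio, and all function names are illustrative assumptions; only the overall idea (score kernels as vectors, store a retained kernel index set per filter, and convolve by gathering the matched input channels) follows the abstract.

```python
import numpy as np

def prune_kernels(weight, keep_ratio=0.5):
    """Score each kernel by magnitude AND direction, keep the top ones per filter.
    weight: (out_ch, in_ch, k, k). Returns the compacted weight
    (out_ch, n_keep, k, k) and the kept input-channel indices (out_ch, n_keep).
    Scoring rule is an illustrative assumption, not the paper's exact criterion."""
    out_ch, in_ch, k, _ = weight.shape
    vecs = weight.reshape(out_ch, in_ch, -1)            # each kernel as a vector
    norms = np.linalg.norm(vecs, axis=2)                # magnitude term
    mean_vec = vecs.mean(axis=1, keepdims=True)         # per-filter mean direction
    cos = (vecs * mean_vec).sum(axis=2) / (
        norms * np.linalg.norm(mean_vec, axis=2) + 1e-12)
    score = norms * (1.0 + np.abs(cos))                 # magnitude + direction
    n_keep = max(1, int(round(in_ch * keep_ratio)))
    idx = np.argsort(-score, axis=1)[:, :n_keep]        # retained kernel index set
    idx.sort(axis=1)
    kept = np.take_along_axis(weight, idx[:, :, None, None], axis=1)
    return kept, idx

def kernel_index_conv(x, kept, idx):
    """Kernel-index convolution sketch: each filter convolves only the input
    channels whose kernels survived pruning (naive loops, stride 1, no padding).
    x: (in_ch, H, W); kept: (out_ch, n_keep, k, k); idx: (out_ch, n_keep)."""
    out_ch, n_keep, k, _ = kept.shape
    _, H, W = x.shape
    oh, ow = H - k + 1, W - k + 1
    out = np.zeros((out_ch, oh, ow))
    for o in range(out_ch):
        for j, c in enumerate(idx[o]):                  # gather matched channels
            for i in range(oh):
                for w in range(ow):
                    out[o, i, w] += (x[c, i:i + k, w:w + k] * kept[o, j]).sum()
    return out
```

Because the pruned kernels are compacted into a dense (out_ch, n_keep, k, k) array rather than scattered zeros, the 4D structure is preserved and the per-filter work drops in proportion to the keep ratio, which is what makes the pruning "structured" in the GPU-friendly sense the abstract claims.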
Pages: 118547-118557
Page count: 11