V-SKP: Vectorized Kernel-Based Structured Kernel Pruning for Accelerating Deep Convolutional Neural Networks

Cited by: 0
Authors
Koo, Kwanghyun [1 ]
Kim, Hyun [1 ]
Affiliation
[1] Seoul Natl Univ Sci & Technol, Res Ctr Elect & Informat Technol, Dept Elect & Informat Engn, Seoul 01811, South Korea
Source
IEEE ACCESS | 2023 / Vol. 11
Keywords
Kernel pruning; convolutional neural networks; vectorized kernel; network compression;
DOI
10.1109/ACCESS.2023.3326534
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
In recent years, kernel pruning, which combines the advantages of weight pruning and filter pruning, has been actively studied. Although kernel pruning must be implemented as structured pruning to obtain an actual acceleration effect on GPUs, most existing methods are limited in that they are implemented as unstructured pruning. To address this problem, we propose vectorized kernel-based structured kernel pruning (V-SKP), which achieves a large FLOPs reduction with minimal accuracy loss while maintaining a 4D weight structure. V-SKP treats each kernel of a convolution layer as a vector and performs pruning by extracting features from each filter vector of the layer. Whereas conventional L1/L2-norm-based pruning considers only the magnitude of a vector and removes parameters without regard to its direction, V-SKP removes kernels by considering both the features of the filter vector and the magnitude and direction of the vectorized kernel. Moreover, because kernel-pruned weights cannot be used directly with standard convolution, the retained kernel indices are compressed and stored in a kernel index set during the proposed pruning scheme so that the kernel-pruned weights can be matched with the input channels. In addition, a kernel index convolution method is proposed that performs convolution on the GPU by matching input channels with the kernel-pruned weights. Experimental results show that V-SKP achieves significant parameter and FLOPs reductions with acceptable accuracy degradation on various networks, including ResNet-50, and, unlike conventional kernel pruning techniques, delivers real acceleration on GPUs.
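The mechanism the abstract describes can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the exact scoring formula (here, L2 magnitude weighted by cosine alignment with the filter's mean kernel), the keep ratio, and all function names are illustrative assumptions; only the overall idea (score kernels as vectors, store a retained kernel index set per filter, and convolve by gathering the matched input channels) follows the abstract.

```python
import numpy as np

def prune_kernels(weight, keep_ratio=0.5):
    """Score each kernel by magnitude AND direction, keep the top ones per filter.
    weight: (out_ch, in_ch, k, k). Returns the compacted weight
    (out_ch, n_keep, k, k) and the kept input-channel indices (out_ch, n_keep).
    Scoring rule is an illustrative assumption, not the paper's exact criterion."""
    out_ch, in_ch, k, _ = weight.shape
    vecs = weight.reshape(out_ch, in_ch, -1)            # each kernel as a vector
    norms = np.linalg.norm(vecs, axis=2)                # magnitude term
    mean_vec = vecs.mean(axis=1, keepdims=True)         # per-filter mean direction
    cos = (vecs * mean_vec).sum(axis=2) / (
        norms * np.linalg.norm(mean_vec, axis=2) + 1e-12)
    score = norms * (1.0 + np.abs(cos))                 # magnitude + direction
    n_keep = max(1, int(round(in_ch * keep_ratio)))
    idx = np.argsort(-score, axis=1)[:, :n_keep]        # retained kernel index set
    idx.sort(axis=1)
    kept = np.take_along_axis(weight, idx[:, :, None, None], axis=1)
    return kept, idx

def kernel_index_conv(x, kept, idx):
    """Kernel-index convolution sketch: each filter convolves only the input
    channels whose kernels survived pruning (naive loops, stride 1, no padding).
    x: (in_ch, H, W); kept: (out_ch, n_keep, k, k); idx: (out_ch, n_keep)."""
    out_ch, n_keep, k, _ = kept.shape
    _, H, W = x.shape
    oh, ow = H - k + 1, W - k + 1
    out = np.zeros((out_ch, oh, ow))
    for o in range(out_ch):
        for j, c in enumerate(idx[o]):                  # gather matched channels
            for i in range(oh):
                for w in range(ow):
                    out[o, i, w] += (x[c, i:i + k, w:w + k] * kept[o, j]).sum()
    return out
```

Because the pruned kernels are compacted into a dense (out_ch, n_keep, k, k) array rather than scattered zeros, the 4D structure is preserved and the per-filter work drops in proportion to the keep ratio, which is what makes the pruning "structured" in the GPU-friendly sense the abstract claims.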
Pages: 118547-118557
Page count: 11