Efficient Layer Compression Without Pruning

Cited by: 10
Authors
Wu, Jie [1 ]
Zhu, Dingshun [1 ]
Fang, Leyuan [1 ,2 ]
Deng, Yue [3 ,4 ,5 ]
Zhong, Zhun [6 ]
Affiliations
[1] Hunan Univ, Coll Elect & Informat Engn, Changsha 410082, Peoples R China
[2] Peng Cheng Lab, Shenzhen 518000, Peoples R China
[3] Beihang Univ, Sch Astronaut, Beijing 102206, Peoples R China
[4] Sci & Technol Space Intelligent Control Lab, Beijing 100191, Peoples R China
[5] Beihang Univ, Inst Artificial Intelligence, Adv Innovat Ctr Big Data & Brain Comp, Beijing 100191, Peoples R China
[6] Univ Nottingham, Sch Comp Sci, Nottingham NG8 1BB, England
Keywords
Deep neural networks; layer compression; pruning; image classification
DOI
10.1109/TIP.2023.3302519
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Network pruning is one of the chief means of improving the computational efficiency of Deep Neural Networks (DNNs). Pruning-based methods generally discard network kernels, channels, or layers, which, however, inevitably disrupts the well-learned correlations of the original network and thus degrades performance. In this work, we propose an Efficient Layer Compression (ELC) approach that compresses serial layers by decoupling and merging them rather than pruning. Specifically, we first propose a novel decoupling module that decouples the layers, enabling us to readily merge serial layers that include both nonlinear and convolutional layers. Then, the decoupled network is losslessly merged through an equivalent conversion of the parameters. In this way, our ELC effectively reduces the depth of the network without destroying the correlations of the convolutional layers. To the best of our knowledge, we are the first to exploit the mergeability of serial convolutional layers for lossless network layer compression. Experimental results on two datasets demonstrate that our method retains superior performance with FLOPs reductions of 74.1% for VGG-16 and 54.6% for ResNet-56. In addition, our ELC improves inference speed by 2x on the Jetson AGX Xavier edge device.
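The abstract's core idea — lossless merging of serial layers via an equivalent conversion of parameters — can be illustrated on the simplest case. The paper merges convolutional layers (with its decoupling module handling the nonlinearities in between); the sketch below only shows the underlying algebra for two consecutive linear layers with no activation between them, which is an assumption-laden simplification, not the paper's actual method:

```python
import numpy as np

# Two consecutive linear layers (think of them as 1x1 convolutions)
# with no nonlinearity in between.
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((8, 4)), rng.standard_normal(8)
W2, b2 = rng.standard_normal((3, 8)), rng.standard_normal(3)

def two_layers(x):
    """Original two-layer computation: W2 (W1 x + b1) + b2."""
    return W2 @ (W1 @ x + b1) + b2

# Equivalent single layer obtained by parameter conversion:
# composing the two affine maps gives W = W2 W1 and b = W2 b1 + b2.
W_merged = W2 @ W1
b_merged = W2 @ b1 + b2

def merged_layer(x):
    """Single merged layer; exactly equal to two_layers for every x."""
    return W_merged @ x + b_merged

x = rng.standard_normal(4)
assert np.allclose(two_layers(x), merged_layer(x))
```

Because the conversion is an exact algebraic identity, the merged network computes the same function with one layer fewer; the difficulty the paper addresses is extending this to serial layers that are separated by nonlinearities, which plain composition like the above cannot handle.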
Pages: 4689-4700
Page count: 12