Efficient Layer Compression Without Pruning

Cited by: 10
Authors
Wu, Jie [1]
Zhu, Dingshun [1]
Fang, Leyuan [1,2]
Deng, Yue [3,4,5]
Zhong, Zhun [6]
Affiliations
[1] Hunan Univ, Coll Elect & Informat Engn, Changsha 410082, Peoples R China
[2] Peng Cheng Lab, Shenzhen 518000, Peoples R China
[3] Beihang Univ, Sch Astronaut, Beijing 102206, Peoples R China
[4] Sci & Technol Space Intelligent Control Lab, Beijing 100191, Peoples R China
[5] Beihang Univ, Inst Artificial Intelligence, Adv Innovat Ctr Big Data & Brain Comp, Beijing 100191, Peoples R China
[6] Univ Nottingham, Sch Comp Sci, Nottingham NG8 1BB, England
Keywords
Deep neural networks; layer compression; pruning; image classification
DOI
10.1109/TIP.2023.3302519
CLC number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Network pruning is one of the chief means of improving the computational efficiency of Deep Neural Networks (DNNs). Pruning-based methods generally discard network kernels, channels, or layers, which, however, inevitably disrupts the well-learned correlations of the original network and thus leads to performance degradation. In this work, we propose an Efficient Layer Compression (ELC) approach that compresses serial layers by decoupling and merging them rather than pruning. Specifically, we first propose a novel decoupling module that decouples the layers, enabling us to readily merge serial layers that include both nonlinear and convolutional layers. Then, the decoupled network is losslessly merged based on an equivalent conversion of the parameters. In this way, ELC effectively reduces the depth of the network without destroying the correlations of the convolutional layers. To the best of our knowledge, we are the first to exploit the mergeability of serial convolutional layers for lossless network layer compression. Experiments on two datasets demonstrate that our method retains superior performance with FLOPs reductions of 74.1% for VGG-16 and 54.6% for ResNet-56, respectively. In addition, ELC improves inference speed by 2x on a Jetson AGX Xavier edge device.
Pages: 4689-4700
Page count: 12
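
Below is a minimal PyTorch sketch (not the authors' released ELC code) of the lossless merging step the abstract describes, restricted to the purely linear case: two consecutive convolutions with no nonlinearity between them are collapsed into a single equivalent convolution by an equivalent conversion of the parameters. The decoupling module that lets ELC apply this to serial layers containing nonlinear layers is not reproduced here, and the helper name merge_serial_convs and its stride/padding assumptions are illustrative, not the paper's API.

import torch
import torch.nn.functional as F

def merge_serial_convs(conv1: torch.nn.Conv2d, conv2: torch.nn.Conv2d) -> torch.nn.Conv2d:
    """Return a single Conv2d equivalent to conv2(conv1(x)).

    Exactness assumes: no nonlinearity between the two layers, conv1 has
    stride 1, conv2 has no zero padding, and both use groups=1.
    """
    assert conv1.stride == (1, 1) and conv2.padding == (0, 0)
    assert conv1.groups == 1 and conv2.groups == 1
    w1, w2 = conv1.weight, conv2.weight          # (M, C, k1, k1), (O, M, k2, k2)
    k2 = w2.shape[-1]
    # Compose the kernels spatially: flipping w2 turns F.conv2d's
    # cross-correlation into a true convolution of the two kernels.
    merged_w = F.conv2d(w1.permute(1, 0, 2, 3),  # (C, M, k1, k1)
                        w2.flip([2, 3]),         # (O, M, k2, k2)
                        padding=k2 - 1)          # -> (C, O, k1+k2-1, k1+k2-1)
    merged_w = merged_w.permute(1, 0, 2, 3).contiguous()
    # Fold the biases: b2 plus conv2 applied to the constant map produced by b1.
    b1 = conv1.bias if conv1.bias is not None else torch.zeros(w1.shape[0], dtype=w1.dtype, device=w1.device)
    b2 = conv2.bias if conv2.bias is not None else torch.zeros(w2.shape[0], dtype=w2.dtype, device=w2.device)
    merged_b = b2 + w2.sum(dim=(2, 3)) @ b1
    merged = torch.nn.Conv2d(conv1.in_channels, conv2.out_channels,
                             kernel_size=merged_w.shape[-1],
                             stride=conv2.stride,
                             padding=conv1.padding,
                             bias=True)
    merged.weight.data.copy_(merged_w)
    merged.bias.data.copy_(merged_b)
    return merged

Under the stated assumptions, feeding a random input to both paths should give matching outputs up to floating-point error: merge_serial_convs(conv1, conv2)(x) versus conv2(conv1(x)), with the merged kernel growing to k1 + k2 - 1 while one layer is removed from the network depth.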