Retraining-free methods for fast on-the-fly pruning of convolutional neural networks

Cited by: 11
Authors
Ashouri, Amir H. [1 ]
Abdelrahman, Tarek S. [2 ,3 ]
Dos Remedios, Alwyn [4 ]
Affiliations
[1] Univ Toronto, ECE Dept, Toronto, ON, Canada
[2] Univ Toronto, Elect & Comp Engn, Toronto, ON, Canada
[3] Univ Toronto, Comp Sci, Toronto, ON, Canada
[4] Qualcomm Inc, Markham, ON, Canada
Keywords
Deep learning; Convolutional neural networks; Sparsity; Pruning
DOI
10.1016/j.neucom.2019.08.063
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
We explore retraining-free pruning of CNNs. We propose and evaluate three model-independent methods for sparsification of model weights. Our methods are magnitude-based, efficient, and can be applied on-the-fly during model load time, which is necessary in some deployment contexts. We evaluate the effectiveness of these methods in introducing sparsity with minimal loss of inference accuracy using five state-of-the-art pretrained CNNs. The evaluation shows that the methods reduce the number of weights by up to 73% (i.e., a compression factor of 3.7×) without incurring more than 5% loss in Top-5 accuracy. These results also hold for quantized versions of the CNNs. We develop a classifier to determine which of the three methods is most suited for a given model. Finally, we apply additional fine-tuning (impractical in our deployment context) and show that it gains only a further 8% sparsity. This indicates that our on-the-fly methods capture much of the sparsity that can be attained without retraining, yet remain efficient and straightforward to use. (C) 2019 Elsevier B.V. All rights reserved.
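Since the methods described in the abstract are magnitude-based and applied at load time, their core operation amounts to zeroing the smallest-magnitude weights of each tensor. The following is a minimal NumPy sketch of load-time magnitude pruning; the function name, the single global threshold, and the 73% sparsity target are illustrative assumptions, not the paper's actual three methods.

import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float = 0.73) -> np.ndarray:
    """Zero out the smallest-magnitude fraction of a weight tensor.

    Hypothetical illustration of load-time magnitude pruning; the paper's
    three methods select thresholds differently (this is not their code).
    """
    k = int(sparsity * weights.size)
    if k == 0:
        return weights.copy()
    # The k-th smallest absolute value becomes the pruning threshold.
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0  # drop weights at/below threshold
    return pruned

# Example: prune a conv-layer-shaped tensor to roughly 73% sparsity,
# matching the compression level reported in the abstract.
w = np.random.randn(64, 3, 3, 3).astype(np.float32)
w_pruned = magnitude_prune(w)
print(f"sparsity: {(w_pruned == 0).mean():.1%}")

Because such a pass touches each tensor exactly once and needs no gradient computation, it can run while the model is being loaded, which is the deployment constraint the abstract emphasizes.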
Pages: 56-69
Page count: 14