Self-distillation enhanced adaptive pruning of convolutional neural networks

Cited by: 3
Authors
Diao, Huabin [1 ]
Li, Gongyan [2 ]
Xu, Shaoyun [2 ]
Kong, Chao [1 ]
Wang, Wei [1 ]
Liu, Shuai [1 ]
He, Yuefeng [1 ]
Affiliations
[1] Anhui Polytech Univ, Beijing Middle Rd, Wuhu 241000, Anhui, Peoples R China
[2] Chinese Acad Sci, Inst Microelect, 3 Beituocheng West Rd, Beijing 100029, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Convolutional neural networks; Self-distillation; Adaptive pruning;
DOI
10.1016/j.patcog.2024.110942
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Convolutional neural networks (CNNs) suffer from large parameter counts and high computational complexity. To address this, we propose an adaptive pruning algorithm based on self-distillation. The algorithm introduces a trainable parameter for each channel to control channel pruning and integrates pruning into network training, so that pruning and fine-tuning are completed in a single training pass that yields the final pruned model. Moreover, the framework requires only a single overall pruning rate to achieve adaptive per-layer pruning, avoiding tedious hyperparameter tuning and making the pruning process simple, efficient, and less iterative. Additionally, self-distillation leverages the knowledge of the pretrained CNN to guide its own pruning, helping the network recover from pruning-induced performance degradation and reach higher accuracy. Extensive pruning experiments on various CNN models and datasets demonstrate that at least 75% of redundant parameters can be removed without sacrificing model accuracy.
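The two ingredients named in the abstract, a trainable per-channel parameter that gates pruning and a self-distillation loss from the pretrained network, can be sketched as follows. This is a minimal illustration assuming PyTorch; the names `GatedConv` and `distillation_loss`, the sigmoid gating, and the temperature value are assumptions for exposition, not the paper's exact formulation.

```python
# Hedged sketch: channel gating for pruning plus a distillation loss.
# The exact gate parameterization and loss in the paper may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedConv(nn.Module):
    """Conv layer whose output channels are scaled by trainable gates.

    Channels whose learned gate is near zero can be removed after
    training; the gating scheme here is illustrative only.
    """
    def __init__(self, in_ch, out_ch, kernel_size=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size,
                              padding=kernel_size // 2)
        # One trainable scalar per output channel controls pruning.
        self.gate = nn.Parameter(torch.ones(out_ch))

    def forward(self, x):
        # Scale each output channel by its (sigmoid-squashed) gate.
        return self.conv(x) * torch.sigmoid(self.gate).view(1, -1, 1, 1)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """KL divergence between temperature-softened student and teacher."""
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)

# Toy usage: forward a batch through one gated layer.
layer = GatedConv(3, 8)
x = torch.randn(2, 3, 16, 16)
y = layer(x)
print(y.shape)  # torch.Size([2, 8, 16, 16])
```

In training, the distillation term would be added to the task loss together with a sparsity penalty on the gates, so that pruning (gates driven to zero) and fine-tuning happen in the same pass, as the abstract describes.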
Pages: 11