Filter pruning via expectation-maximization

Cited: 1
Authors
Xu, Sheng [1]
Li, Yanjing [2]
Yang, Linlin [3]
Zhang, Baochang [1,4]
Sun, Dianmin [5]
Liu, Kexin [1,6]
Affiliations
[1] Beihang Univ, Sch Automat Sci & Elect Engn, Beijing 100191, Peoples R China
[2] Beihang Univ, Sch Elect & Informat Engn, Beijing 100191, Peoples R China
[3] Univ Bonn, Inst Comp Sci 2, Bonn, Germany
[4] Nanchang Inst Technol, Nanchang, Jiangxi, Peoples R China
[5] Shandong First Med Univ & Shandong Acad Med Sci, Shandong Canc Hosp & Inst, Dept Thorac Surg, Jinan 250117, Shandong, Peoples R China
[6] Beihang Univ, State Key Lab Software Dev Environm, Beijing Adv Innovat Ctr Big Data & Brain Comp, Beijing 100191, Peoples R China
Keywords
Expectation maximization; CNN compression; CNN pruning
DOI
10.1007/s00521-022-07127-2
CLC number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Redundancy in convolutional neural networks (CNNs) introduces a large number of extra parameters, increasing computation and reducing filter diversity. In this paper, we introduce filter pruning via expectation-maximization (FPEM) to trim redundant structures and improve the diversity of the remaining ones. Our method is based on the observation that the filter diversity of a pruned network is positively correlated with its performance. The expectation step groups the filters of each layer by maximum likelihood and averages the output feature maps within each cluster. The maximization step computes the likelihood estimates of the clusters and formulates a loss function that makes the distributions within each cluster consistent. After training, redundant filters within each cluster can be trimmed so that only diverse filters are retained. On CIFAR-10, the pruned models outperform the corresponding full models. On ImageNet ILSVRC-12, FPEM reduces the FLOPs of ResNet-50 by 46.5% with only a 0.36% decrease in Top-1 accuracy, advancing the state of the art. FPEM also generalizes well to the object detection task.
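The abstract describes an E-step that clusters filters layer-wise and an M-step that enforces intra-cluster consistency. Below is a minimal, illustrative PyTorch sketch of that idea, assuming hard-assignment EM (k-means on flattened filter weights) as the clustering step and a mean-squared spread penalty as the consistency loss; the function names, the loss form, and the keep-one-filter-per-cluster rule are hypothetical stand-ins, not the authors' exact formulation.

import torch
import torch.nn as nn


def cluster_filters(weight, num_clusters, iters=10):
    """E-step analogue (assumed): group a conv layer's filters by k-means,
    a hard-assignment special case of EM.

    weight: (out_channels, in_channels, kH, kW) tensor.
    Returns a (out_channels,) tensor of cluster assignments.
    """
    flat = weight.detach().flatten(1)                  # one row per filter
    centers = flat[torch.randperm(flat.size(0))[:num_clusters]].clone()
    for _ in range(iters):
        dists = torch.cdist(flat, centers)             # (filters, clusters)
        assign = dists.argmin(dim=1)
        for k in range(num_clusters):
            mask = assign == k
            if mask.any():
                centers[k] = flat[mask].mean(dim=0)
    return assign


def intra_cluster_consistency_loss(weight, assign, num_clusters):
    """M-step analogue (assumed): penalize the spread of filters around
    their cluster mean, pushing same-cluster filters toward a shared
    distribution so the redundant ones become safe to trim."""
    flat = weight.flatten(1)
    loss = flat.new_zeros(())
    for k in range(num_clusters):
        mask = assign == k
        if mask.sum() > 1:
            group = flat[mask]
            loss = loss + (group - group.mean(dim=0, keepdim=True)).pow(2).mean()
    return loss


def keep_filters(assign, num_clusters):
    """After training, retain one representative filter per cluster
    (a hypothetical pruning rule for illustration)."""
    keep = []
    for k in range(num_clusters):
        idx = (assign == k).nonzero(as_tuple=True)[0]
        if idx.numel() > 0:
            keep.append(idx[0].item())
    return sorted(keep)


conv = nn.Conv2d(16, 32, kernel_size=3)
assign = cluster_filters(conv.weight, num_clusters=8)
loss = intra_cluster_consistency_loss(conv.weight, assign, 8)
loss.backward()                                        # added to the task loss in practice
print(keep_filters(assign, 8))                         # surviving filter indices

In a full pipeline, the consistency loss would be combined with the task loss during training, and the surviving filter indices would then be used to build the slimmer network.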
Pages: 12807-12818
Number of pages: 12