Dynamical Channel Pruning by Conditional Accuracy Change for Deep Neural Networks

Cited by: 54
Authors
Chen, Zhiqiang [1 ,2 ]
Xu, Ting-Bing [2 ,3 ]
Du, Changde [1 ,2 ]
Liu, Cheng-Lin [2 ,3 ,4 ]
He, Huiguang [2 ,4 ,5 ]
Affiliations
[1] Chinese Acad Sci CASIA, Inst Automat, Res Ctr Brain Inspired Intelligence, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci UCAS, Sch Artificial Intelligence, Beijing 100049, Peoples R China
[3] Chinese Acad Sci CASIA, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
[4] Chinese Acad Sci, Ctr Excellence Brain Sci & Intelligence Technol, Beijing 100190, Peoples R China
[5] Chinese Acad Sci CASIA, Res Ctr Brain Inspired Intelligence, Natl Lab Pattern Recognit, Inst Automat, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Training; Channel estimation; Logic gates; Computer architecture; Convolution; Biological neural networks; Automation; Conditional accuracy change (CAC); direct criterion; dynamical channel pruning; neural network compression; structure shaping;
DOI
10.1109/TNNLS.2020.2979517
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Channel pruning is an effective technique that has been widely applied to deep neural network compression. However, many existing methods prune from a pretrained model, which leads to repetitious pruning and fine-tuning processes. In this article, we propose a dynamical channel pruning method that prunes unimportant channels at an early stage of training. Rather than relying on indirect criteria (e.g., weight norm, absolute weight sum, or reconstruction error) to guide connection or channel pruning, we design criteria directly related to the final accuracy of the network to evaluate the importance of each channel. Specifically, a channelwise gate is designed to randomly enable or disable each channel so that the conditional accuracy change (CAC) can be estimated under the condition that a given channel is disabled. In practice, we construct two effective and efficient criteria to dynamically estimate the CAC at each training iteration; thus, unimportant channels can be gradually pruned during the training process. Finally, extensive experiments on multiple data sets (i.e., ImageNet, CIFAR, and MNIST) with various networks (i.e., ResNet, VGG, and MLP) demonstrate that the proposed method effectively reduces the parameters and computations of the baseline network while yielding higher or competitive accuracy. Interestingly, if we Double the initial Channels and then Prune Half (DCPH) of them down to the baseline's channel count, the network enjoys a remarkable performance improvement by shaping a more desirable structure.
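The abstract describes a channelwise gate that randomly disables channels during training and a per-channel estimate of the conditional accuracy change (CAC) used to decide which channels to prune. The following is a minimal PyTorch-style sketch of that idea only; the class and method names (ChannelGate, update_cac, prune) and the hyperparameters (keep_prob, momentum) are illustrative assumptions, not the authors' implementation or their exact CAC criteria.

```python
# Minimal sketch (not the paper's code) of a channelwise stochastic gate with a
# running conditional-accuracy-change (CAC) estimate per channel. Names such as
# ChannelGate, update_cac, prune, keep_prob, and momentum are illustrative only.
import torch
import torch.nn as nn


class ChannelGate(nn.Module):
    """Randomly enables/disables channels of a conv feature map during training."""

    def __init__(self, num_channels, keep_prob=0.9):
        super().__init__()
        self.keep_prob = keep_prob
        # Running CAC estimate per channel (larger = disabling it hurts accuracy more).
        self.register_buffer("cac", torch.zeros(num_channels))
        # 1 = channel still alive, 0 = permanently pruned.
        self.register_buffer("mask", torch.ones(num_channels))
        self.last_gate = None

    def forward(self, x):
        if self.training:
            # Bernoulli gate: each surviving channel stays on with probability keep_prob.
            gate = (torch.rand(x.size(1), device=x.device) < self.keep_prob).float()
            gate = gate * self.mask
            self.last_gate = gate
            return x * gate.view(1, -1, 1, 1)
        # At evaluation time only permanently pruned channels are zeroed out.
        return x * self.mask.view(1, -1, 1, 1)

    @torch.no_grad()
    def update_cac(self, batch_acc, baseline_acc, momentum=0.99):
        """Attribute this batch's accuracy drop to the channels that were disabled."""
        if self.last_gate is None:
            return
        disabled = (self.last_gate == 0) & (self.mask == 1)
        delta = baseline_acc - batch_acc  # accuracy lost while these channels were off
        self.cac[disabled] = momentum * self.cac[disabled] + (1.0 - momentum) * delta

    @torch.no_grad()
    def prune(self, num_to_prune):
        """Permanently disable the surviving channels with the smallest estimated CAC."""
        alive = self.mask.nonzero(as_tuple=True)[0]
        if alive.numel() <= num_to_prune:
            return
        order = self.cac[alive].argsort()  # least important first
        self.mask[alive[order[:num_to_prune]]] = 0.0
```

In such a setup, one gate could follow each convolutional layer; after every training iteration, the batch accuracy would be compared against a running baseline accuracy via update_cac, and prune would be called periodically so that unimportant channels disappear gradually during training, mirroring the dynamical pruning schedule the abstract describes.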
Pages: 799-813
Number of pages: 15