Magnitude and Uncertainty Pruning Criterion for Neural Networks

Cited by: 0
Authors
Ko, Vinnie [1 ]
Oehmcke, Stefan [2 ]
Gieseke, Fabian [2 ]
Affiliations
[1] Univ Oslo, Oslo, Norway
[2] Univ Copenhagen, Copenhagen, Denmark
Source
2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) | 2019
Keywords
Neural network compression; pruning; overparameterization; Wald test; MAXIMUM-LIKELIHOOD;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Neural networks have achieved dramatic improvements in recent years and now represent the state of the art for many real-world tasks. One drawback, however, is that many of these models are overparameterized, which makes them both computationally and memory intensive. Furthermore, overparameterization can also lead to undesired overfitting side-effects. Inspired by recently proposed magnitude-based pruning schemes and the Wald test from the field of statistics, we introduce a novel magnitude and uncertainty (M&U) pruning criterion that helps to lessen such shortcomings. One important advantage of our M&U pruning criterion is that it is scale-invariant, unlike the magnitude-based pruning criterion, which is sensitive to a rescaling of the weights. In addition, we present a "pseudo bootstrap" scheme, which can efficiently estimate the uncertainty of the weights from their update information during training. Our experimental evaluation, based on various neural network architectures and datasets, shows that the new criterion leads to more compressed models than purely magnitude-based pruning criteria while, at the same time, losing less predictive power.
Pages: 2317-2326 (10 pages)
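
To make the abstract's idea concrete, below is a minimal, hypothetical sketch of a Wald-style M&U score paired with a pseudo-bootstrap-like uncertainty estimate. All names (updates, mu_score, prune_frac) and the exact estimator are illustrative assumptions under this record's description, not the paper's actual implementation.

import numpy as np

# Toy setup: one weight vector and the per-step updates it received
# during "training" (random stand-ins for SGD steps).
rng = np.random.default_rng(0)
n_steps, n_weights = 200, 10
updates = rng.normal(scale=0.01, size=(n_steps, n_weights))
weights = updates.sum(axis=0) + rng.normal(scale=0.5, size=n_weights)

# Pseudo-bootstrap-style uncertainty: take the spread of the recorded
# updates as a proxy standard error per weight (an assumption here; the
# paper derives its own estimator from the update information).
se = updates.std(axis=0) / np.sqrt(n_steps)

# Wald-statistic-like M&U score: magnitude divided by uncertainty.
# Rescaling a weight rescales its standard error too, so the ratio is
# insensitive to scale, unlike the plain |weight| criterion.
mu_score = np.abs(weights) / (se + 1e-12)

# Prune the fraction of weights with the smallest scores.
prune_frac = 0.3
mask = mu_score > np.quantile(mu_score, prune_frac)
print(f"kept {mask.sum()} of {n_weights} weights")

The property this sketch mirrors is the one the abstract highlights: dividing a weight's magnitude by an uncertainty estimate restores comparability across layers whose weights live on different scales.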