Less Is More: Adaptive Trainable Gradient Dropout for Deep Neural Networks

Cited by: 4
Authors
Avgerinos, Christos [1]
Vretos, Nicholas [1]
Daras, Petros [1]
Affiliations
[1] Ctr Res & Technol Hellas CERTH, Informat Technol Inst ITI, Thessaloniki 57001, Greece
Keywords
adaptive dropout; gradient dropout; gradient freezing; trainable dropout
DOI
10.3390/s23031325
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry];
Subject classification codes
070302; 081704;
Abstract
The computational power of artificial neural networks has enabled the scientific community to exploit available data in ways previously inconceivable. However, deep neural networks require an overwhelming quantity of data to capture the underlying relationships within it and thereby complete their assigned task. Feeding a deep neural network vast amounts of data usually ensures efficiency, but may harm the network's ability to generalize. To tackle this, numerous regularization techniques have been proposed, with dropout being one of the most dominant. This paper proposes a selective gradient dropout method which, instead of dropping random weights, learns to freeze the training process of specific connections, thereby increasing the overall network's sparsity in an adaptive manner by driving it to utilize more salient weights. The experimental results show that the resulting sparse network outperforms the baseline on numerous image classification datasets, and that these results were obtained after significantly fewer training epochs.
Pages: 12