Less Is More: Adaptive Trainable Gradient Dropout for Deep Neural Networks

Cited by: 4

Authors:

Avgerinos, Christos [1]

Vretos, Nicholas [1]

Daras, Petros [1]

Affiliations:

[1] Centre for Research and Technology Hellas (CERTH), Information Technologies Institute (ITI), Thessaloniki 57001, Greece

Source:

SENSORS | 2023 / Vol. 23 / Issue 03

Keywords:

adaptive dropout; gradient dropout; gradient freezing; trainable dropout

DOI:

10.3390/s23031325

Chinese Library Classification (CLC) number:

O65 [Analytical Chemistry]

Subject classification codes:

070302; 081704

Abstract:

The computational power of artificial neural networks has enabled the scientific community to exploit available data in ways that were previously inconceivable. However, deep neural networks require an overwhelming quantity of data to learn the underlying relationships within it and thereby complete the task they have been assigned. Feeding a deep neural network vast amounts of data usually ensures efficiency, but it may also harm the network's ability to generalize. To tackle this, numerous regularization techniques have been proposed, with dropout being one of the most dominant. This paper proposes a selective gradient dropout method which, instead of dropping random weights, learns to freeze the training process of specific connections, thereby increasing the overall network's sparsity in an adaptive manner and driving it to utilize more salient weights. The experimental results show that the resulting sparse network outperforms the baseline on numerous image classification datasets and, additionally, that these results were obtained after significantly fewer training epochs.
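
The general idea of freezing the training of selected connections can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not say how the freezing decision is learned, so the per-connection `gate_scores`, the sigmoid gate, the `threshold`, and the function name `masked_sgd_step` are all hypothetical and chosen only for illustration. The sketch shows a single gradient-descent step in which the gradient of a gated-off connection is dropped, so that weight keeps its current value rather than being zeroed.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def masked_sgd_step(weights, grads, gate_scores, lr=0.1, threshold=0.5):
    """One SGD step in which gradients of gated-off connections are dropped.

    gate_scores are hypothetical per-connection scores, assumed to be learned
    elsewhere; a connection is frozen when sigmoid(score) < threshold.
    """
    new_weights = []
    frozen = []
    for w, g, s in zip(weights, grads, gate_scores):
        keep = sigmoid(s) >= threshold
        # A frozen connection keeps its current value: only its update is
        # skipped (assumption: the weight itself is not set to zero).
        new_weights.append(w - lr * g if keep else w)
        frozen.append(not keep)
    return new_weights, frozen


# Illustrative values only, not data from the paper.
weights = [0.5, -1.0, 2.0]
grads = [1.0, 1.0, 1.0]
gate_scores = [3.0, -3.0, 0.0]
updated, frozen = masked_sgd_step(weights, grads, gate_scores)
```

In this toy run only the second connection, whose gate score is strongly negative, is frozen; the other two receive an ordinary gradient update.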


Pages: 12
