An adaptive Drop method for deep neural networks regularization: Estimation of DropConnect hyperparameter using generalization gap

Cited by: 27
Authors
Hssayni, El Houssaine [1 ]
Joudar, Nour-Eddine [2 ]
Ettaouil, Mohamed [1 ]
Affiliations
[1] Univ Sidi Mohamed Ben Abdellah, Fac Sci & Tech, Dept Math, Lab Modeling & Math Struct, Fes, Morocco
[2] Mohammed V Univ Rabat, Res Ctr STIS, Dept Appl Math & Informat, ENSAM, M2CS, Rabat, Morocco
Keywords
Deep neural networks; DropConnect; Regularization; Rademacher complexity; Generalization gap
DOI
10.1016/j.knosys.2022.109567
Chinese Library Classification
TP18 [Theory of artificial intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Deep neural networks (DNNs) have solved numerous challenging real-world problems. However, successful DNNs often require a large number of parameters, which can produce undesirable phenomena, notably overfitting. DropConnect, a generalization of Dropout, is one of the successful stochastic regularizers that prevent overfitting in deep neural networks: it drops individual weights according to a fixed probability, yielding dynamic sparse DNNs. The DropConnect hyperparameter, however, has so far been fixed by hand rather than estimated, and its choice lacks a theoretical understanding. In this context, we propose an estimation of the DropConnect hyperparameter using the generalization gap and the Rademacher complexity. This estimation gives rise to a new DropConnect technique named Adaptive DropConnect (A-DropConnect), in which the hyperparameter is updated in a data-dependent manner during training. The efficiency of A-DropConnect is demonstrated through several experiments; the numerical results show that the proposed method yields significant improvements in classification performance compared to the state of the art. (C) 2022 Elsevier B.V. All rights reserved.
Pages: 10
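
To make the mechanism described in the abstract concrete, here is a minimal PyTorch sketch, not taken from the paper, of a DropConnect linear layer together with a hypothetical data-dependent update of the drop probability driven by the measured generalization gap. The paper derives its update from a Rademacher-complexity bound; the simple proportional rule `update_drop_probability` below, and its `step`, `p_min`, and `p_max` parameters, are illustrative assumptions only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DropConnectLinear(nn.Module):
    """Linear layer whose individual weights are dropped i.i.d. with
    probability p during training (DropConnect); at inference the
    expected weight (1 - p) * W is used instead of sampling a mask."""

    def __init__(self, in_features, out_features, p=0.5):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.p = p  # drop probability, updated adaptively during training

    def forward(self, x):
        if self.training:
            # Sample a Bernoulli mask over the weight matrix:
            # each entry is kept with probability 1 - p.
            mask = torch.bernoulli(
                torch.full_like(self.linear.weight, 1.0 - self.p))
            return F.linear(x, self.linear.weight * mask, self.linear.bias)
        # Inference: scale weights by the keep probability (expectation).
        return F.linear(x, self.linear.weight * (1.0 - self.p),
                        self.linear.bias)

def update_drop_probability(layer, train_acc, val_acc,
                            step=0.05, p_min=0.05, p_max=0.95):
    """Hypothetical stand-in for the paper's estimator: increase the drop
    probability when the empirical generalization gap (train_acc - val_acc)
    widens, and decrease it when the gap closes, clipped to [p_min, p_max]."""
    gap = train_acc - val_acc
    layer.p = float(min(p_max, max(p_min, layer.p + step * gap)))
```

In a training loop one would call `update_drop_probability(layer, train_acc, val_acc)` once per epoch, after evaluating both accuracies, so that the drop probability grows as the model starts to overfit and shrinks when the generalization gap narrows.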