Asymptotic Convergence Rate of Dropout on Shallow Linear Neural Networks

Cited by: 1
Authors
Senen-Cerda, Albert [1 ]
Sanders, Jaron [1 ]
Affiliations
[1] Eindhoven Univ Technol, Eindhoven, Netherlands
Keywords
Dropout; neural networks; convergence rate; gradient flow;
DOI
10.1145/3530898
CLC classification
TP3 [Computing technology, computer technology];
Discipline code
0812;
Abstract
We analyze the convergence rate of gradient flows on objective functions induced by Dropout and Dropconnect, when applying them to shallow linear Neural Networks (NNs), which can also be viewed as performing matrix factorization with a particular regularizer. Such dropout algorithms are regularization techniques that use {0, 1}-valued random variables to filter weights during training in order to avoid co-adaptation of features. By leveraging a recent result on nonconvex optimization and conducting a careful analysis of the set of minimizers as well as the Hessian of the loss function, we are able to obtain (i) a local convergence proof of the gradient flow and (ii) a bound on the convergence rate that depends on the data, the dropout probability, and the width of the NN. Finally, we compare this theoretical bound to numerical simulations, which are in qualitative agreement with the convergence bound and match it when starting sufficiently close to a minimizer.
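To make the setting in the abstract concrete, here is a minimal sketch (not the authors' code) of dropout training on a shallow two-layer linear network y ≈ W2 diag(b/p) W1 x, where b is a {0, 1}-valued Bernoulli(p) filter on the hidden layer. The dimensions, step size, dropout probability, and synthetic data below are illustrative assumptions, not values from the paper.

```python
# Sketch: stochastic gradient descent on the dropout-filtered least-squares
# objective for a shallow linear NN. All hyperparameters are assumptions.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden, d_out, n = 5, 8, 3, 200
p, lr, steps = 0.8, 1e-2, 5000

X = rng.standard_normal((d_in, n))
Y = rng.standard_normal((d_out, d_in)) @ X          # linear teacher data
W1 = 0.1 * rng.standard_normal((d_hidden, d_in))
W2 = 0.1 * rng.standard_normal((d_out, d_hidden))

for t in range(steps):
    b = rng.binomial(1, p, size=d_hidden) / p       # {0,1}-valued filter, rescaled by 1/p
    H = np.diag(b) @ W1 @ X                          # filtered hidden activations
    R = W2 @ H - Y                                   # residual of the filtered network
    gW2 = R @ H.T / n                                # gradient w.r.t. W2
    gW1 = np.diag(b) @ W2.T @ R @ X.T / n            # gradient w.r.t. W1
    W2 -= lr * gW2
    W1 -= lr * gW1

# In expectation over b, the updates descend E_b || W2 diag(b/p) W1 X - Y ||_F^2,
# i.e., matrix factorization with a dropout-induced regularizer, as in the abstract.
print("final (deterministic) loss:", np.linalg.norm(W2 @ W1 @ X - Y) ** 2 / n)
```

The paper studies the gradient flow (continuous-time limit) of this kind of averaged objective; the discrete SGD loop above is only meant to illustrate the filtered-weights mechanism.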
Pages: 53
Related papers
(50 items in total)
  • [31] The emergence of a concept in shallow neural networks
    Agliari, Elena
    Alemanno, Francesco
    Barra, Adriano
    De Marzo, Giordano
    NEURAL NETWORKS, 2022, 148 : 232 - 253
  • [32] Global exponential stability for delayed cellular neural networks and estimate of exponential convergence rate
    Zhang Qiang
    JOURNAL OF SYSTEMS ENGINEERING AND ELECTRONICS, 2004, (03) : 344 - 349
  • [33] Multistability and convergence in delayed neural networks
    Cheng, Chang-Yuan
    Lin, Kuang-Hui
    Shih, Chih-Wen
    PHYSICA D-NONLINEAR PHENOMENA, 2007, 225 (01) : 61 - 74
  • [34] Correlation-based structural dropout for convolutional neural networks
    Zeng, Yuyuan
    Dai, Tao
    Chen, Bin
    Xia, Shu-Tao
    Lu, Jian
    PATTERN RECOGNITION, 2021, 120
  • [35] Deep Learning Convolutional Neural Networks with Dropout - a Parallel Approach
    Shen, Jingyi
    Shafiq, M. Omair
    2018 17TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2018, : 572 - 577
  • [36] Contextual Soft Dropout Method in Training of Artificial Neural Networks
    Tu Nga Ly
    Kern, Rafal
    Pathak, Khanindra
    Wolk, Krzysztof
    Burnell, Erik Dawid
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2021, 2021, 12672 : 692 - 703
  • [37] Data Dropout: Optimizing Training Data for Convolutional Neural Networks
    Wang, Tianyang
    Huan, Jun
    Li, Bo
    2018 IEEE 30TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2018, : 39 - 46
  • [38] IMPROVING DEEP NEURAL NETWORKS BY USING SPARSE DROPOUT STRATEGY
    Zheng, Hao
    Chen, Mingming
    Liu, Wenju
    Yang, Zhanlei
    Liang, Shan
    2014 IEEE CHINA SUMMIT & INTERNATIONAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (CHINASIP), 2014, : 21 - 26
  • [39] Dropout: A Simple Way to Prevent Neural Networks from Overfitting
    Srivastava, Nitish
    Hinton, Geoffrey
    Krizhevsky, Alex
    Sutskever, Ilya
    Salakhutdinov, Ruslan
    JOURNAL OF MACHINE LEARNING RESEARCH, 2014, 15 : 1929 - 1958
  • [40] CONVERGENCE TIME ON THE RS MODEL FOR NEURAL NETWORKS
    Penna, T. J. P.
    de Oliveira, P. M. C.
    Arenzon, J. J.
    de Almeida, R. M. C.
    Iglesias, J. R.
    INTERNATIONAL JOURNAL OF MODERN PHYSICS C, 1991, 2 (03) : 711 - 717