Asymptotic Convergence Rate of Dropout on Shallow Linear Neural Networks

Cited by: 1
Authors
Senen-Cerda, Albert [1 ]
Sanders, Jaron [1 ]
Affiliations
[1] Eindhoven Univ Technol, Eindhoven, Netherlands
Keywords
Dropout; neural networks; convergence rate; gradient flow;
DOI
10.1145/3530898
CLC classification
TP3 [Computing technology, computer technology];
Discipline code
0812;
Abstract
We analyze the convergence rate of gradient flows on objective functions induced by Dropout and Dropconnect, when applying them to shallow linear Neural Networks (NNs), which can also be viewed as performing matrix factorization with a particular regularizer. Such dropout algorithms are regularization techniques that use {0, 1}-valued random variables to filter weights during training in order to avoid co-adaptation of features. By leveraging a recent result on nonconvex optimization and conducting a careful analysis of the set of minimizers as well as the Hessian of the loss function, we are able to obtain (i) a local convergence proof of the gradient flow and (ii) a bound on the convergence rate that depends on the data, the dropout probability, and the width of the NN. Finally, we compare this theoretical bound to numerical simulations, which are in qualitative agreement with the convergence bound and match it when starting sufficiently close to a minimizer.
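To make the setting in the abstract concrete, here is a minimal sketch (not the authors' code) of dropout training on a shallow two-layer linear network y ≈ W2 diag(b/p) W1 x, where b is a {0, 1}-valued Bernoulli(p) filter on the hidden layer. The dimensions, step size, dropout probability, and synthetic data below are illustrative assumptions, not values from the paper.

```python
# Sketch: stochastic gradient descent on the dropout-filtered least-squares
# objective for a shallow linear NN. All hyperparameters are assumptions.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden, d_out, n = 5, 8, 3, 200
p, lr, steps = 0.8, 1e-2, 5000

X = rng.standard_normal((d_in, n))
Y = rng.standard_normal((d_out, d_in)) @ X          # linear teacher data
W1 = 0.1 * rng.standard_normal((d_hidden, d_in))
W2 = 0.1 * rng.standard_normal((d_out, d_hidden))

for t in range(steps):
    b = rng.binomial(1, p, size=d_hidden) / p       # {0,1}-valued filter, rescaled by 1/p
    H = np.diag(b) @ W1 @ X                          # filtered hidden activations
    R = W2 @ H - Y                                   # residual of the filtered network
    gW2 = R @ H.T / n                                # gradient w.r.t. W2
    gW1 = np.diag(b) @ W2.T @ R @ X.T / n            # gradient w.r.t. W1
    W2 -= lr * gW2
    W1 -= lr * gW1

# In expectation over b, the updates descend E_b || W2 diag(b/p) W1 X - Y ||_F^2,
# i.e., matrix factorization with a dropout-induced regularizer, as in the abstract.
print("final (deterministic) loss:", np.linalg.norm(W2 @ W1 @ X - Y) ** 2 / n)
```

The paper studies the gradient flow (continuous-time limit) of this kind of averaged objective; the discrete SGD loop above is only meant to illustrate the filtered-weights mechanism.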
Pages: 53
Related papers
(50 items in total)
  • [31] The emergence of a concept in shallow neural networks
    Agliari, Elena
    Alemanno, Francesco
    Barra, Adriano
    De Marzo, Giordano
    NEURAL NETWORKS, 2022, 148 : 232 - 253
  • [32] Global exponential stability for delayed cellular neural networks and estimate of exponential convergence rate
    Zhang Qiang
    JOURNAL OF SYSTEMS ENGINEERING AND ELECTRONICS, 2004, (03) : 344 - 349
  • [33] Multistability and convergence in delayed neural networks
    Cheng, Chang-Yuan
    Lin, Kuang-Hui
    Shih, Chih-Wen
    PHYSICA D-NONLINEAR PHENOMENA, 2007, 225 (01) : 61 - 74
  • [34] Correlation-based structural dropout for convolutional neural networks
    Zeng, Yuyuan
    Dai, Tao
    Chen, Bin
    Xia, Shu-Tao
    Lu, Jian
    PATTERN RECOGNITION, 2021, 120
  • [35] Deep Learning Convolutional Neural Networks with Dropout - a Parallel Approach
    Shen, Jingyi
    Shafiq, M. Omair
    2018 17TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2018, : 572 - 577
  • [36] Contextual Soft Dropout Method in Training of Artificial Neural Networks
    Tu Nga Ly
    Kern, Rafal
    Pathak, Khanindra
    Wolk, Krzysztof
    Burnell, Erik Dawid
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2021, 2021, 12672 : 692 - 703
  • [37] Data Dropout: Optimizing Training Data for Convolutional Neural Networks
    Wang, Tianyang
    Huan, Jun
    Li, Bo
    2018 IEEE 30TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2018, : 39 - 46
  • [38] IMPROVING DEEP NEURAL NETWORKS BY USING SPARSE DROPOUT STRATEGY
    Zheng, Hao
    Chen, Mingming
    Liu, Wenju
    Yang, Zhanlei
    Liang, Shan
    2014 IEEE CHINA SUMMIT & INTERNATIONAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (CHINASIP), 2014, : 21 - 26
  • [39] Dropout: A Simple Way to Prevent Neural Networks from Overfitting
    Srivastava, Nitish
    Hinton, Geoffrey
    Krizhevsky, Alex
    Sutskever, Ilya
    Salakhutdinov, Ruslan
    JOURNAL OF MACHINE LEARNING RESEARCH, 2014, 15 : 1929 - 1958
  • [40] CONVERGENCE TIME ON THE RS MODEL FOR NEURAL NETWORKS
    Penna, T. J. P.
    de Oliveira, P. M. C.
    Arenzon, J. J.
    de Almeida, R. M. C.
    Iglesias, J. R.
    INTERNATIONAL JOURNAL OF MODERN PHYSICS C, 1991, 2 (03) : 711 - 717