Stability analysis of stochastic gradient descent for homogeneous neural networks and linear classifiers

Cited by: 6
Authors
Paquin, Alexandre Lemire [1 ]
Chaib-draa, Brahim [1 ]
Giguere, Philippe [1 ]
Affiliations
[1] Laval Univ, Dept Comp Sci & Software Engn, Pavillon Adrien Pouliot, 1065 Ave Med, Quebec City, PQ G1V 0A6, Canada
Keywords
Generalization; Deep learning; Stochastic gradient descent; Stability
DOI
10.1016/j.neunet.2023.04.028
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
We prove new generalization bounds for stochastic gradient descent when training classifiers with invariances. Our analysis is based on the stability framework and covers both the convex case of linear classifiers and the non-convex case of homogeneous neural networks. We analyze stability with respect to the normalized version of the loss function used for training, which leads to investigating a form of angle-wise stability instead of Euclidean stability in the weights. For neural networks, the distance measure we consider is invariant to rescaling the weights of each layer. Furthermore, we exploit the notion of on-average stability to obtain a data-dependent quantity in the bound. In our numerical experiments, this data-dependent quantity is more favorable when training with larger learning rates, which may help shed light on why larger learning rates can lead to better generalization in some practical settings. (c) 2023 Elsevier Ltd. All rights reserved.
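To make the abstract's notion of a rescaling-invariant, angle-wise distance concrete, the following is a minimal illustrative sketch, not the paper's exact measure: each layer's weight matrix is normalized to unit Frobenius norm before comparing two parameter settings, so multiplying any layer by a positive constant leaves the distance unchanged.

```python
import numpy as np

def anglewise_distance(weights_a, weights_b):
    """Illustrative angle-wise distance between two networks.

    Each argument is a list of per-layer weight matrices. Layers are
    projected onto the unit sphere (Frobenius norm 1) before comparison,
    so the result is invariant to rescaling any layer by a positive
    constant. Hypothetical sketch; not the exact definition in the paper.
    """
    total = 0.0
    for wa, wb in zip(weights_a, weights_b):
        na = wa / np.linalg.norm(wa)   # normalize layer of network A
        nb = wb / np.linalg.norm(wb)   # normalize layer of network B
        total += np.linalg.norm(na - nb) ** 2
    return np.sqrt(total)

# Rescaling individual layers does not change the distance.
w1 = [np.random.randn(4, 3), np.random.randn(2, 4)]
w2 = [3.0 * w1[0], 0.5 * w1[1]]        # same directions, different scales
print(anglewise_distance(w1, w2))      # approximately 0.0
```

For homogeneous networks, such per-layer rescalings leave the predicted class unchanged, which is why an angle-wise notion of stability can be more informative than a Euclidean one.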
Pages: 382-394
Number of pages: 13
Related Papers
50 records in total
  • [31] A proof of convergence for stochastic gradient descent in the training of artificial neural networks with ReLU activation for constant target functions
    Jentzen, Arnulf
    Riekert, Adrian
    ZEITSCHRIFT FUR ANGEWANDTE MATHEMATIK UND PHYSIK, 2022, 73 (05):
  • [32] Distributed stochastic gradient descent for link prediction in signed social networks
    Zhang, Han
    Wu, Gang
    Ling, Qing
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2019, 2019 (1)
  • [33] An Efficient Stochastic Gradient Descent Algorithm to Maximize the Coverage of Cellular Networks
    Liu, Yaxi
    Huangfu, Wei
    Zhang, Haijun
    Long, Keping
    IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2019, 18 (07) : 3424 - 3436
  • [35] ANALYSIS OF KINETIC MODELS FOR LABEL SWITCHING AND STOCHASTIC GRADIENT DESCENT
    Burger, Martin
    Rossi, Alex
    KINETIC AND RELATED MODELS, 2023, 16 (05) : 717 - 747
  • [36] A Tight Convergence Analysis for Stochastic Gradient Descent with Delayed Updates
    Arjevani, Yossi
    Shamir, Ohad
    Srebro, Nathan
    ALGORITHMIC LEARNING THEORY, VOL 117, 2020, 117 : 111 - 132
  • [37] Strong error analysis for stochastic gradient descent optimization algorithms
    Jentzen, Arnulf
    Kuckuck, Benno
    Neufeld, Ariel
    von Wurstemberger, Philippe
    IMA JOURNAL OF NUMERICAL ANALYSIS, 2021, 41 (01) : 455 - 492
  • [38] Forecasting the productivity of a solar distiller enhanced with an inclined absorber plate using stochastic gradient descent in artificial neural networks
    Mohammed, Suha A.
    Al-Haddad, Luttfi A.
    Alawee, Wissam H.
    Dhahad, Hayder A.
    Jaber, Alaa Abdulhady
    Al-Haddad, Sinan A.
    MULTISCALE AND MULTIDISCIPLINARY MODELING EXPERIMENTS AND DESIGN, 2024, 7 (03) : 1819 - 1829
  • [39] Stability Analysis in a Class of Markov Switched Stochastic Hopfield Neural Networks
    Feng, Lichao
    Cao, Jinde
    Liu, Lei
    NEURAL PROCESSING LETTERS, 2019, 50 (01) : 413 - 430