Stability analysis of stochastic gradient descent for homogeneous neural networks and linear classifiers

Cited by: 6
Authors
Paquin, Alexandre Lemire [1 ]
Chaib-draa, Brahim [1 ]
Giguere, Philippe [1 ]
Institutions
[1] Laval Univ, Dept Comp Sci & Software Engn, Pavillon Adrien Pouliot 1065, Ave Med, Quebec City, PQ G1V 0A6, Canada
Keywords
Generalization; Deep learning; Stochastic gradient descent; Stability
DOI
10.1016/j.neunet.2023.04.028
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
We prove new generalization bounds for stochastic gradient descent when training classifiers with invariances. Our analysis is based on the stability framework and covers both the convex case of linear classifiers and the non-convex case of homogeneous neural networks. We analyze stability with respect to the normalized version of the loss function used for training. This leads to investigating a form of angle-wise stability instead of Euclidean stability in weights. For neural networks, the measure of distance we consider is invariant to rescaling the weights of each layer. Furthermore, we exploit the notion of on-average stability in order to obtain a data-dependent quantity in the bound. In our numerical experiments, this data-dependent quantity is more favorable when training with larger learning rates, which may help shed light on why larger learning rates can lead to better generalization in some practical scenarios. © 2023 Elsevier Ltd. All rights reserved.
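As a minimal illustrative sketch of the abstract's central objects (the notation below is assumed for illustration and is not taken verbatim from the paper): for a linear classifier with weight vector $w$, stability is measured with respect to the normalized loss, so only the direction of $w$ matters, and closeness between SGD iterates is angular rather than Euclidean:

\[ \hat{\ell}(w; z) = \ell\big(w/\lVert w \rVert;\, z\big) \qquad \text{(normalized loss)} \]
\[ d_{\angle}(w, w') = \arccos\!\Big(\frac{\langle w, w' \rangle}{\lVert w \rVert\,\lVert w' \rVert}\Big) \qquad \text{(angle-wise distance, in place of } \lVert w - w' \rVert \text{)} \]

For an $L$-layer homogeneous network $W = (W_1, \dots, W_L)$, one distance with the invariance the abstract describes is a sum of per-layer angular distances, $d(W, W') = \sum_{l=1}^{L} d_{\angle}(W_l, W_l')$, which is unchanged under per-layer rescalings $W_l \mapsto c_l W_l$ with $c_l > 0$.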
Pages: 382-394
Page count: 13
Related Papers (50 in total)
  • [1] Convergence of gradient descent for learning linear neural networks
    Nguegnang, Gabin Maxime
    Rauhut, Holger
    Terstiege, Ulrich
    ADVANCES IN CONTINUOUS AND DISCRETE MODELS, 2024, 2024 (01)
  • [2] Overall error analysis for the training of deep neural networks via stochastic gradient descent with random initialisation
    Jentzen, Arnulf
    Welti, Timo
    APPLIED MATHEMATICS AND COMPUTATION, 2023, 455
  • [3] Stability and optimization error of stochastic gradient descent for pairwise learning
    Shen, Wei
    Yang, Zhenhuan
    Ying, Yiming
    Yuan, Xiaoming
    ANALYSIS AND APPLICATIONS, 2020, 18 (05) : 887 - 927
  • [4] Damped Newton Stochastic Gradient Descent Method for Neural Networks Training
    Zhou, Jingcheng
    Wei, Wei
    Zhang, Ruizhi
    Zheng, Zhiming
    MATHEMATICS, 2021, 9 (13)
  • [5] Stochastic Markov gradient descent and training low-bit neural networks
    Ashbrock, Jonathan
    Powell, Alexander M.
    SAMPLING THEORY SIGNAL PROCESSING AND DATA ANALYSIS, 2021, 19 (02)
  • [6] Simple Evolutionary Optimization Can Rival Stochastic Gradient Descent in Neural Networks
    Morse, Gregory
    Stanley, Kenneth O.
    GECCO'16: PROCEEDINGS OF THE 2016 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE, 2016 : 477 - 484
  • [7] INVERSION OF NEURAL NETWORKS BY GRADIENT DESCENT
    Kindermann, J.
    Linden, A.
    PARALLEL COMPUTING, 1990, 14 (03) : 277 - 286
  • [8] Convergence analysis of distributed stochastic gradient descent with shuffling
    Meng, Qi
    Chen, Wei
    Wang, Yue
    Ma, Zhi-Ming
    Liu, Tie-Yan
    NEUROCOMPUTING, 2019, 337 : 46 - 57
  • [9] A Comparative Analysis of Gradient Descent-Based Optimization Algorithms on Convolutional Neural Networks
    Dogo, E. M.
    Afolabi, O. J.
    Nwulu, N. I.
    Twala, B.
    Aigbavboa, C. O.
    PROCEEDINGS OF THE 2018 INTERNATIONAL CONFERENCE ON COMPUTATIONAL TECHNIQUES, ELECTRONICS AND MECHANICAL SYSTEMS (CTEMS), 2018 : 92 - 99
  • [10] Convergence of Stochastic Gradient Descent in Deep Neural Network
    Zhou, Bai-cun
    Han, Cong-ying
    Guo, Tian-de
    ACTA MATHEMATICAE APPLICATAE SINICA-ENGLISH SERIES, 2021, 37 : 126 - 136