Stability analysis of stochastic gradient descent for homogeneous neural networks and linear classifiers

Cited by: 6
Authors
Paquin, Alexandre Lemire [1 ]
Chaib-draa, Brahim [1 ]
Giguere, Philippe [1 ]
Affiliation
[1] Laval Univ, Dept Comp Sci & Software Engn, Pavillon Adrien-Pouliot, 1065 Ave de la Medecine, Quebec City, PQ G1V 0A6, Canada
Keywords
Generalization; Deep learning; Stochastic gradient descent; Stability
DOI
10.1016/j.neunet.2023.04.028
CLC classification
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
We prove new generalization bounds for stochastic gradient descent when training classifiers with invariances. Our analysis is based on the stability framework and covers both the convex case of linear classifiers and the non-convex case of homogeneous neural networks. We analyze stability with respect to the normalized version of the loss function used for training, which leads us to investigate a form of angle-wise stability instead of Euclidean stability in the weights. For neural networks, the measure of distance we consider is invariant to rescaling the weights of each layer. Furthermore, we exploit the notion of on-average stability to obtain a data-dependent quantity in the bound. In our numerical experiments, this quantity is more favorable when training with larger learning rates, which may shed some light on why larger learning rates can lead to better generalization in some practical scenarios.
© 2023 Elsevier Ltd. All rights reserved.
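To make the abstract's key quantities concrete, below is a minimal NumPy sketch of an angle-wise distance between weight vectors and a loss evaluated on the normalized predictor. The function names, the choice of logistic loss, and the per-layer aggregation are illustrative assumptions, not the paper's exact definitions.

```python
import numpy as np

def angular_distance(w1, w2):
    """Angle between two weight vectors. Unlike the Euclidean distance
    ||w1 - w2||, it is invariant to positively rescaling either vector."""
    cos = np.dot(w1, w2) / (np.linalg.norm(w1) * np.linalg.norm(w2))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def normalized_logistic_loss(w, x, y):
    """Logistic loss of the normalized linear predictor w / ||w|| on an
    example (x, y) with y in {-1, +1}. For a scale-invariant classifier
    only the direction of w matters, so stability of the angle of w
    controls stability of this normalized loss."""
    margin = y * np.dot(w, x) / np.linalg.norm(w)
    return np.logaddexp(0.0, -margin)  # log(1 + exp(-margin)), overflow-safe

def layerwise_angular_distance(layers_a, layers_b):
    """One (assumed) way to compare two homogeneous networks: sum the
    per-layer angles. Rescaling any single layer changes neither these
    angles nor the network's predicted labels."""
    return sum(angular_distance(a.ravel(), b.ravel())
               for a, b in zip(layers_a, layers_b))

# Rescaling a weight vector leaves the angle-wise distance at zero,
# whereas the Euclidean distance would grow with the scale factor.
w = np.array([1.0, -2.0, 0.5])
assert np.isclose(angular_distance(w, 3.0 * w), 0.0)
```

In a stability argument, such a distance would be applied to the weights produced by two SGD runs on datasets differing in a single example, in place of the usual Euclidean distance between the two weight trajectories.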
Pages: 382-394
Page count: 13
Related papers (50 in total)
  • [11] Convergence of Stochastic Gradient Descent in Deep Neural Network
    Zhou, Bai-cun
    Han, Cong-ying
    Guo, Tian-de
    ACTA MATHEMATICAE APPLICATAE SINICA-ENGLISH SERIES, 2021, 37 (01): : 126 - 136
  • [12] Stability for the training of deep neural networks and other classifiers
    Berlyand, Leonid
    Jabin, Pierre-Emmanuel
    Safsten, C. Alex
    MATHEMATICAL MODELS & METHODS IN APPLIED SCIENCES, 2021, 31 (11): : 2345 - 2390
  • [13] Fractional-order stochastic gradient descent method with momentum and energy for deep neural networks
    Zhou, Xingwen
    You, Zhenghao
    Sun, Weiguo
    Zhao, Dongdong
    Yan, Shi
    NEURAL NETWORKS, 2025, 181
  • [14] Efficient Optimization of Neural Networks for Predictive Hiring: An In-Depth Approach to Stochastic Gradient Descent
    Temsamani, Yassine Khallouk
    Achchab, Said
    2024 5TH INTERNATIONAL CONFERENCE ON COMPUTING, NETWORKS AND INTERNET OF THINGS, CNIOT 2024, 2024, : 588 - 594
  • [15] Error Analysis of Stochastic Gradient Descent Ranking
    Chen, Hong
    Tang, Yi
    Li, Luoqing
    Yuan, Yuan
    Li, Xuelong
    Tang, Yuanyan
    IEEE TRANSACTIONS ON CYBERNETICS, 2013, 43 (03) : 898 - 909
  • [16] On the Saturation Phenomenon of Stochastic Gradient Descent for Linear Inverse Problems
    Jin, Bangti
    Zhou, Zehui
    Zou, Jun
    SIAM-ASA JOURNAL ON UNCERTAINTY QUANTIFICATION, 2021, 9 (04): : 1553 - 1588
  • [17] STADIA: Photonic Stochastic Gradient Descent for Neural Network Accelerators
    Xia, Chengpeng
    Chen, Yawen
    Zhang, Haibo
    Wu, Jigang
    ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 2023, 22 (05)
  • [18] Generalization Guarantees of Gradient Descent for Shallow Neural Networks
    Wang, Puyu
    Lei, Yunwen
    Wang, Di
    Ying, Yiming
    Zhou, Ding-Xuan
    NEURAL COMPUTATION, 2025, 37 (02) : 344 - 402
  • [19] Exponential Convergence Time of Gradient Descent for One-Dimensional Deep Linear Neural Networks
    Shamir, Ohad
    CONFERENCE ON LEARNING THEORY, VOL 99, 2019, 99
  • [20] Robust decentralized stochastic gradient descent over unstable networks
    Zheng, Yanwei
    Zhang, Liangxu
    Chen, Shuzhen
    Zhang, Xiao
    Cai, Zhipeng
    Cheng, Xiuzhen
    COMPUTER COMMUNICATIONS, 2023, 203 : 163 - 179