Stability analysis of stochastic gradient descent for homogeneous neural networks and linear classifiers

Cited by: 6

Authors:

Paquin, Alexandre Lemire [1]

Chaib-draa, Brahim [1]

Giguere, Philippe [1]

Affiliations:

[1] Laval Univ, Dept Comp Sci & Software Engn, Pavillon Adrien Pouliot 1065,Ave Med, Quebec City, PQ G1V 0A6, Canada

Source:

NEURAL NETWORKS | 2023 / Vol. 164

Keywords:

Generalization; Deep learning; Stochastic gradient descent; Stability

DOI:

10.1016/j.neunet.2023.04.028

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We prove new generalization bounds for stochastic gradient descent when training classifiers with invariances. Our analysis is based on the stability framework and covers both the convex case of linear classifiers and the non-convex case of homogeneous neural networks. We analyze stability with respect to the normalized version of the loss function used for training. This leads to investigating a form of angle-wise stability instead of Euclidean stability in weights. For neural networks, the measure of distance we consider is invariant to rescaling the weights of each layer. Furthermore, we exploit the notion of on-average stability in order to obtain a data-dependent quantity in the bound. This data-dependent quantity is seen to be more favorable when training with larger learning rates in our numerical experiments. This might help to shed some light on why larger learning rates can lead to better generalization in some practical scenarios. (c) 2023 Elsevier Ltd. All rights reserved.
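The contrast the abstract draws between angle-wise and Euclidean distance in weights can be illustrated for a linear classifier with a minimal sketch. The function names (`normalize`, `angle`, `euclidean`) and the example vectors are hypothetical and chosen only for illustration; this is not the paper's definition of stability, only a demonstration that the angle between weight vectors ignores positive rescaling while the Euclidean distance does not.

```python
import math

def normalize(w):
    """Return w scaled to unit Euclidean norm (assumes w is non-zero)."""
    norm = math.sqrt(sum(x * x for x in w))
    return [x / norm for x in w]

def angle(w1, w2):
    """Angle in radians between two weight vectors."""
    u, v = normalize(w1), normalize(w2)
    cos = sum(a * b for a, b in zip(u, v))
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cos = max(-1.0, min(1.0, cos))
    return math.acos(cos)

def euclidean(w1, w2):
    """Euclidean distance between two weight vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(w1, w2)))

# Hypothetical weights: w_scaled is w multiplied by 10, so both vectors
# define the same linear decision rule sign(<w, x>).
w = [3.0, 4.0]
w_scaled = [30.0, 40.0]

angle_between = angle(w, w_scaled)          # close to zero
euclid_between = euclidean(w, w_scaled)     # large, despite identical predictions
```

Here the two weight vectors give identical predictions, so a distance that treats them as far apart (the Euclidean one) overstates how different the classifiers are, whereas the angle does not.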

Pages: 382-394

Page count: 13
