Central limit theorems for stochastic gradient descent with averaging for stable manifolds

Cited by: 2
Authors
Dereich, Steffen [1 ]
Kassing, Sebastian [2 ]
Affiliations
[1] Univ Münster, Inst Math Stochast, Fac Math & Comp Sci, Münster, Germany
[2] Univ Bielefeld, Fac Math, Bielefeld, Germany
Source
ELECTRONIC JOURNAL OF PROBABILITY | 2023, Vol. 28
Keywords
stochastic approximation; Robbins-Monro; Ruppert-Polyak average; deep learning; stable manifold; APPROXIMATION;
DOI
10.1214/23-EJP947
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject Classification Codes
020208; 070103; 0714
Abstract
In this article, we establish new central limit theorems for Ruppert-Polyak averaged stochastic gradient descent schemes. Compared to previous work, we do not assume that convergence occurs to an isolated attractor but instead allow convergence to a stable manifold. On the stable manifold the target function is constant, and the oscillations of the iterates in the tangential direction may be significantly larger than those in the normal direction. We nevertheless recover a central limit theorem for the averaged scheme in the normal direction, with the same rates as in the case of isolated attractors. In the setting where the magnitude of the random perturbation is of constant order, our results cover step sizes $\gamma_n = C_\gamma n^{-\gamma}$ with $C_\gamma > 0$ and $\gamma \in (3/4, 1)$. In particular, we show that the beneficial effect of averaging prevails in more general situations.
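
The step-size regime in the abstract can be illustrated with a short, self-contained sketch. The snippet below is not the authors' code: it uses a toy quadratic objective with an isolated minimiser (the paper itself treats convergence to stable manifolds), and the function `averaged_sgd`, its parameters, and the noise model are illustrative assumptions. It runs SGD with step sizes $\gamma_n = C_\gamma n^{-\gamma}$, $\gamma \in (3/4, 1)$, and returns both the last iterate and its Ruppert-Polyak average.

```python
# Minimal sketch (assumed, not from the paper): Ruppert-Polyak averaged SGD
# on a toy quadratic objective, with step sizes gamma_n = C_gamma * n**(-gamma)
# and gamma in (3/4, 1) as in the abstract's step-size regime.
import numpy as np

def averaged_sgd(grad, theta0, n_steps, C_gamma=1.0, gamma=0.8, noise_std=1.0, seed=0):
    """Run SGD with gamma_n = C_gamma * n**(-gamma); return the last iterate
    and the Ruppert-Polyak (running) average of all iterates."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    running_sum = np.zeros_like(theta)
    for n in range(1, n_steps + 1):
        step = C_gamma * n ** (-gamma)
        # noisy gradient observation: true gradient plus a zero-mean perturbation
        # of constant order, matching the constant-order noise setting in the abstract
        g = grad(theta) + noise_std * rng.standard_normal(theta.shape)
        theta = theta - step * g
        running_sum += theta
    return theta, running_sum / n_steps

# toy target f(theta) = 0.5 * ||theta||^2 with unique minimiser 0 (an isolated attractor)
def grad(theta):
    return theta

last_iterate, rp_average = averaged_sgd(grad, theta0=np.ones(2), n_steps=100_000)
print("last iterate:          ", last_iterate)  # fluctuates at scale ~ n**(-gamma/2)
print("Ruppert-Polyak average:", rp_average)    # fluctuates at the faster CLT scale ~ n**(-1/2)
```

In this isolated-attractor setting the averaged iterate concentrates at the $n^{-1/2}$ rate. The paper's contribution is that the same rate is recovered in the normal direction when the isolated minimiser is replaced by a whole stable manifold, along which the tangential oscillations of the iterates can be much larger.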
Pages: 48
Related Articles
50 records in total
  • [31] Drivetrain System Identification in a Multi-Task Learning Strategy using Partial Asynchronous Elastic Averaging Stochastic Gradient Descent
    Staessens, Tom
    Crevecoeur, Guillaume
    2020 IEEE/ASME INTERNATIONAL CONFERENCE ON ADVANCED INTELLIGENT MECHATRONICS (AIM), 2020, : 1549 - 1554
  • [32] Stochastic gradient descent for semilinear elliptic equations with uncertainties
    Wang, Ting
    Knap, Jaroslaw
    JOURNAL OF COMPUTATIONAL PHYSICS, 2021, 426
  • [33] Recent Advances in Stochastic Gradient Descent in Deep Learning
    Tian, Yingjie
    Zhang, Yuqi
    Zhang, Haibin
    MATHEMATICS, 2023, 11 (03)
  • [34] Adaptivity of Averaged Stochastic Gradient Descent to Local Strong Convexity for Logistic Regression
    Bach, Francis
    JOURNAL OF MACHINE LEARNING RESEARCH, 2014, 15 : 595 - 627
  • [35] Why random reshuffling beats stochastic gradient descent
    Gurbuzbalaban, M.
    Ozdaglar, A.
    Parrilo, P. A.
    MATHEMATICAL PROGRAMMING, 2021, 186 (1-2) : 49 - 84
  • [36] Central limit theorems for local network statistics
    Maugis, P. A.
    BIOMETRIKA, 2024, 111 (03) : 743 - 754
  • [37] CONTROLLING STOCHASTIC GRADIENT DESCENT USING STOCHASTIC APPROXIMATION FOR ROBUST DISTRIBUTED OPTIMIZATION
    Jain, Adit
    Krishnamurthy, Vikram
    NUMERICAL ALGEBRA CONTROL AND OPTIMIZATION, 2025, 15 (01): : 173 - 195
  • [38] Functional Central Limit Theorem and Strong Law of Large Numbers for Stochastic Gradient Langevin Dynamics
    Lovas, A.
    Rasonyi, M.
    APPLIED MATHEMATICS AND OPTIMIZATION, 2023, 88 (03)
  • [39] Tight analyses for non-smooth stochastic gradient descent
    Harvey, Nicholas J. A.
    Liaw, Christopher
    Plan, Yaniv
    Randhawa, Sikander
    CONFERENCE ON LEARNING THEORY, VOL 99, 2019, 99
  • [40] On the Saturation Phenomenon of Stochastic Gradient Descent for Linear Inverse Problems*
    Jin, Bangti
    Zhou, Zehui
    Zou, Jun
    SIAM-ASA JOURNAL ON UNCERTAINTY QUANTIFICATION, 2021, 9 (04): : 1553 - 1588