Central limit theorems for stochastic gradient descent with averaging for stable manifolds

Cited by: 2
Authors
Dereich, Steffen [1 ]
Kassing, Sebastian [2 ]
Affiliations
[1] Univ Münster, Inst Math Stochast, Fac Math & Comp Sci, Münster, Germany
[2] Univ Bielefeld, Fac Math, Bielefeld, Germany
Source
ELECTRONIC JOURNAL OF PROBABILITY | 2023, Vol. 28
Keywords
stochastic approximation; Robbins-Monro; Ruppert-Polyak average; deep learning; stable manifold; APPROXIMATION;
DOI
10.1214/23-EJP947
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject Classification Codes
020208; 070103; 0714
Abstract
In this article, we establish new central limit theorems for Ruppert-Polyak averaged stochastic gradient descent schemes. Compared to previous work, we do not assume that convergence occurs to an isolated attractor but instead allow convergence to a stable manifold. On the stable manifold the target function is constant, and the oscillations of the iterates in the tangential direction may be significantly larger than those in the normal direction. We nevertheless recover a central limit theorem for the averaged scheme in the normal direction, with the same rates as in the case of isolated attractors. In the setting where the magnitude of the random perturbation is of constant order, our results cover step sizes $\gamma_n = C_\gamma n^{-\gamma}$ with $C_\gamma > 0$ and $\gamma \in (3/4, 1)$. In particular, we show that the beneficial effect of averaging prevails in more general situations.
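
The step-size regime in the abstract can be illustrated with a short, self-contained sketch. The snippet below is not the authors' code: it uses a toy quadratic objective with an isolated minimiser (the paper itself treats convergence to stable manifolds), and the function `averaged_sgd`, its parameters, and the noise model are illustrative assumptions. It runs SGD with step sizes $\gamma_n = C_\gamma n^{-\gamma}$, $\gamma \in (3/4, 1)$, and returns both the last iterate and its Ruppert-Polyak average.

```python
# Minimal sketch (assumed, not from the paper): Ruppert-Polyak averaged SGD
# on a toy quadratic objective, with step sizes gamma_n = C_gamma * n**(-gamma)
# and gamma in (3/4, 1) as in the abstract's step-size regime.
import numpy as np

def averaged_sgd(grad, theta0, n_steps, C_gamma=1.0, gamma=0.8, noise_std=1.0, seed=0):
    """Run SGD with gamma_n = C_gamma * n**(-gamma); return the last iterate
    and the Ruppert-Polyak (running) average of all iterates."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    running_sum = np.zeros_like(theta)
    for n in range(1, n_steps + 1):
        step = C_gamma * n ** (-gamma)
        # noisy gradient observation: true gradient plus a zero-mean perturbation
        # of constant order, matching the constant-order noise setting in the abstract
        g = grad(theta) + noise_std * rng.standard_normal(theta.shape)
        theta = theta - step * g
        running_sum += theta
    return theta, running_sum / n_steps

# toy target f(theta) = 0.5 * ||theta||^2 with unique minimiser 0 (an isolated attractor)
def grad(theta):
    return theta

last_iterate, rp_average = averaged_sgd(grad, theta0=np.ones(2), n_steps=100_000)
print("last iterate:          ", last_iterate)  # fluctuates at scale ~ n**(-gamma/2)
print("Ruppert-Polyak average:", rp_average)    # fluctuates at the faster CLT scale ~ n**(-1/2)
```

In this isolated-attractor setting the averaged iterate concentrates at the $n^{-1/2}$ rate. The paper's contribution is that the same rate is recovered in the normal direction when the isolated minimiser is replaced by a whole stable manifold, along which the tangential oscillations of the iterates can be much larger.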
Pages: 48
Related Articles
50 records in total
  • [31] Drivetrain System Identification in a Multi-Task Learning Strategy using Partial Asynchronous Elastic Averaging Stochastic Gradient Descent
    Staessens, Tom
    Crevecoeur, Guillaume
    2020 IEEE/ASME INTERNATIONAL CONFERENCE ON ADVANCED INTELLIGENT MECHATRONICS (AIM), 2020, : 1549 - 1554
  • [32] Stochastic gradient descent for semilinear elliptic equations with uncertainties
    Wang, Ting
    Knap, Jaroslaw
    JOURNAL OF COMPUTATIONAL PHYSICS, 2021, 426
  • [33] Recent Advances in Stochastic Gradient Descent in Deep Learning
    Tian, Yingjie
    Zhang, Yuqi
    Zhang, Haibin
    MATHEMATICS, 2023, 11 (03)
  • [34] Adaptivity of Averaged Stochastic Gradient Descent to Local Strong Convexity for Logistic Regression
    Bach, Francis
    JOURNAL OF MACHINE LEARNING RESEARCH, 2014, 15 : 595 - 627
  • [35] Why random reshuffling beats stochastic gradient descent
    Gurbuzbalaban, M.
    Ozdaglar, A.
    Parrilo, P. A.
    MATHEMATICAL PROGRAMMING, 2021, 186 (1-2) : 49 - 84
  • [36] Central limit theorems for local network statistics
    Maugis, P. A.
    BIOMETRIKA, 2024, 111 (03) : 743 - 754
  • [37] CONTROLLING STOCHASTIC GRADIENT DESCENT USING STOCHASTIC APPROXIMATION FOR ROBUST DISTRIBUTED OPTIMIZATION
    Jain, Adit
    Krishnamurthy, Vikram
    NUMERICAL ALGEBRA CONTROL AND OPTIMIZATION, 2025, 15 (01): : 173 - 195
  • [38] Functional Central Limit Theorem and Strong Law of Large Numbers for Stochastic Gradient Langevin Dynamics
    Lovas, A.
    Rasonyi, M.
    APPLIED MATHEMATICS AND OPTIMIZATION, 2023, 88 (03)
  • [39] Tight analyses for non-smooth stochastic gradient descent
    Harvey, Nicholas J. A.
    Liaw, Christopher
    Plan, Yaniv
    Randhawa, Sikander
    CONFERENCE ON LEARNING THEORY, VOL 99, 2019, 99
  • [40] On the Saturation Phenomenon of Stochastic Gradient Descent for Linear Inverse Problems*
    Jin, Bangti
    Zhou, Zehui
    Zou, Jun
    SIAM-ASA JOURNAL ON UNCERTAINTY QUANTIFICATION, 2021, 9 (04): : 1553 - 1588