Central limit theorems for stochastic gradient descent with averaging for stable manifolds

Cited: 2
Authors
Dereich, Steffen [1 ]
Kassing, Sebastian [2 ]
Affiliations
[1] Univ Münster, Inst Math Stochast, Fac Math & Comp Sci, Münster, Germany
[2] Univ Bielefeld, Fac Math, Bielefeld, Germany
Source
ELECTRONIC JOURNAL OF PROBABILITY, 2023, Vol. 28
Keywords
stochastic approximation; Robbins-Monro; Ruppert-Polyak average; deep learning; stable manifold; approximation
DOI
10.1214/23-EJP947
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
In this article, we establish new central limit theorems for Ruppert-Polyak averaged stochastic gradient descent schemes. Compared to previous work, we do not assume that convergence occurs to an isolated attractor but instead allow convergence to a stable manifold. On the stable manifold the target function is constant, and the oscillations of the iterates in the tangential direction may be significantly larger than those in the normal direction. We still recover a central limit theorem for the averaged scheme in the normal direction with the same rates as in the case of isolated attractors. In the setting where the magnitude of the random perturbation is of constant order, our results cover step sizes $\gamma_n = C_\gamma n^{-\gamma}$ with $C_\gamma > 0$ and $\gamma \in (3/4, 1)$. In particular, we show that the beneficial effect of averaging prevails in more general situations.
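To make the averaging scheme described in the abstract concrete, the following minimal Python sketch runs plain stochastic gradient descent with polynomial step sizes $\gamma_n = C_\gamma n^{-\gamma}$, $\gamma \in (3/4, 1)$, and maintains the Ruppert-Polyak average of the iterates. The quadratic objective and Gaussian gradient noise are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def averaged_sgd(grad, x0, n_steps, C_gamma=1.0, gamma=0.8, noise_std=1.0, seed=0):
    """Minimal sketch (not the paper's code): SGD with step sizes
    gamma_n = C_gamma * n**(-gamma) plus a running Ruppert-Polyak average."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    x_bar = np.zeros_like(x)                       # running average of the iterates
    for n in range(1, n_steps + 1):
        step = C_gamma * n ** (-gamma)             # gamma_n = C_gamma * n^{-gamma}
        noisy_grad = grad(x) + noise_std * rng.standard_normal(x.shape)
        x = x - step * noisy_grad                  # Robbins-Monro / SGD update
        x_bar += (x - x_bar) / n                   # online Ruppert-Polyak average
    return x, x_bar

if __name__ == "__main__":
    # Illustrative target (an assumption for this sketch): f(x) = 0.5 * ||x||^2,
    # minimised at the origin, so grad(x) = x.
    grad = lambda x: x
    last_iterate, averaged = averaged_sgd(grad, x0=np.ones(3), n_steps=100_000)
    print("last iterate :", last_iterate)
    print("RP average   :", averaged)
```

On such a toy problem the averaged iterate typically fluctuates much less than the last iterate; the central limit theorems in the paper quantify this variance-reduction effect, including when the minimisers form a stable manifold rather than an isolated point.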
Pages: 48
Related Papers (50 records in total)
  • [11] On the regularizing property of stochastic gradient descent
    Jin, Bangti
    Lu, Xiliang
    INVERSE PROBLEMS, 2019, 35 (01)
  • [12] Efficiency Ordering of Stochastic Gradient Descent
    Hu, Jie
    Doshi, Vishwaraj
    Eun, Do Young
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [13] Strong error analysis for stochastic gradient descent optimization algorithms
    Jentzen, Arnulf
    Kuckuck, Benno
    Neufeld, Ariel
    von Wurstemberger, Philippe
    IMA JOURNAL OF NUMERICAL ANALYSIS, 2021, 41 (01) : 455 - 492
  • [14] STOCHASTIC GRADIENT DESCENT WITH FINITE SAMPLES SIZES
    Yuan, Kun
    Ying, Bicheng
    Vlaski, Stefan
    Sayed, Ali H.
    2016 IEEE 26TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2016,
  • [15] Towards stability and optimality in stochastic gradient descent
    Toulis, Panos
    Tran, Dustin
    Airoldi, Edoardo M.
    ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 51, 2016, 51 : 1290 - 1298
  • [16] ON STOCHASTIC SUBGRADIENT MIRROR-DESCENT ALGORITHM WITH WEIGHTED AVERAGING
    Nedic, Angelia
    Lee, Soomin
    SIAM JOURNAL ON OPTIMIZATION, 2014, 24 (01) : 84 - 107
  • [17] Pipelined Stochastic Gradient Descent with Taylor Expansion
    Jang, Bongwon
    Yoo, Inchul
    Yook, Dongsuk
    APPLIED SCIENCES-BASEL, 2023, 13 (21)
  • [18] Analysis of stochastic gradient descent in continuous time
    Latz, Jonas
    STATISTICS AND COMPUTING, 2021, 31 (04)
  • [19] Online inference with debiased stochastic gradient descent
    Han, Ruijian
    Luo, Lan
    Lin, Yuanyuan
    Huang, Jian
    BIOMETRIKA, 2024, 111 (01) : 93 - 108
  • [20] Stochastic Gradient Descent Meets Distribution Regression
    Muecke, Nicole
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130