Central limit theorems for stochastic gradient descent with averaging for stable manifolds

Cited by: 2
Authors
Dereich, Steffen [1]
Kassing, Sebastian [2]
Affiliations
[1] Univ Munster, Inst Math Stochast, Fac Math & Comp Sci, Munster, Germany
[2] Univ Bielefeld, Fac Math, Bielefeld, Germany
Source
ELECTRONIC JOURNAL OF PROBABILITY | 2023, Vol. 28
Keywords
stochastic approximation; Robbins-Monro; Ruppert-Polyak average; deep learning; stable manifold
DOI
10.1214/23-EJP947
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline Classification Codes
020208; 070103; 0714
Abstract
In this article, we establish new central limit theorems for Ruppert-Polyak averaged stochastic gradient descent schemes. In contrast to previous work, we do not assume that convergence occurs to an isolated attractor but instead allow convergence to a stable manifold. On the stable manifold the target function is constant, and the oscillations of the iterates in the tangential direction may be significantly larger than those in the normal direction. We nevertheless recover a central limit theorem for the averaged scheme in the normal direction, with the same rates as in the case of isolated attractors. In the setting where the magnitude of the random perturbation is of constant order, our results cover step sizes γ_n = C_γ n^{-γ} with C_γ > 0 and γ ∈ (3/4, 1). In particular, we show that the beneficial effect of averaging prevails in these more general situations.
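To illustrate the scheme the abstract refers to, the following is a minimal Python sketch of Ruppert-Polyak averaging on a toy objective with constant-order Gaussian gradient noise; it is not the paper's construction, and the names averaged_sgd, grad and noise are hypothetical placeholders introduced only for this example.

import numpy as np

def averaged_sgd(grad, noise, x0, n_steps, C_gamma=1.0, gamma=0.8):
    # SGD / Robbins-Monro iterates X_{n+1} = X_n - gamma_{n+1} * (grad f(X_n) + xi_{n+1})
    # with step sizes gamma_n = C_gamma * n**(-gamma), here gamma chosen in (3/4, 1),
    # together with the running Ruppert-Polyak average bar{X}_n = (1/n) * sum_{k<=n} X_k.
    x = np.asarray(x0, dtype=float)
    avg = np.zeros_like(x)
    for n in range(1, n_steps + 1):
        step = C_gamma * n ** (-gamma)
        x = x - step * (grad(x) + noise(x))
        avg += (x - avg) / n          # online update of the running average
    return x, avg

rng = np.random.default_rng(0)
last_iterate, rp_average = averaged_sgd(
    grad=lambda x: x,                          # toy target f(x) = |x|^2 / 2
    noise=lambda x: rng.normal(size=x.shape),  # constant-order random perturbation
    x0=np.ones(2),
    n_steps=10_000,
)

In this toy setting the averaged iterate typically fluctuates much less around the minimizer than the last iterate, which is the beneficial effect of averaging that the central limit theorems quantify.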
Pages: 48
Related papers
50 records in total
  • [21] Comparing Stochastic Gradient Descent and Mini-batch Gradient Descent Algorithms in Loan Risk Assessment
    Adigun, Abodunrin AbdulGafar
    Yinka-Banjo, Chika
    INFORMATICS AND INTELLIGENT APPLICATIONS, 2022, 1547 : 283 - 296
  • [22] Convergence analysis of distributed stochastic gradient descent with shuffling
    Meng, Qi
    Chen, Wei
    Wang, Yue
    Ma, Zhi-Ming
    Liu, Tie-Yan
    NEUROCOMPUTING, 2019, 337 : 46 - 57
  • [23] Stochastic Gradient Descent and Its Variants in Machine Learning
    Netrapalli, Praneeth
    JOURNAL OF THE INDIAN INSTITUTE OF SCIENCE, 2019, 99 (02) : 201 - 213
  • [24] Sign Based Derivative Filtering for Stochastic Gradient Descent
    Berestizshevsky, Konstantin
    Even, Guy
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2019: DEEP LEARNING, PT II, 2019, 11728 : 208 - 219
  • [25] Scheduled Restart Momentum for Accelerated Stochastic Gradient Descent
    Wang, Bao
    Nguyen, Tan
    Sun, Tao
    Bertozzi, Andrea L.
    Baraniuk, Richard G.
    Osher, Stanley J.
    SIAM JOURNAL ON IMAGING SCIENCES, 2022, 15 (02) : 738 - 761
  • [26] Guided parallelized stochastic gradient descent for delay compensation
    Sharma, Anuraganand
    APPLIED SOFT COMPUTING, 2021, 102
  • [27] ADINE: An Adaptive Momentum Method for Stochastic Gradient Descent
    Srinivasan, Vishwak
    Sankar, Adepu Ravi
    Balasubramanian, Vineeth N.
    PROCEEDINGS OF THE ACM INDIA JOINT INTERNATIONAL CONFERENCE ON DATA SCIENCE AND MANAGEMENT OF DATA (CODS-COMAD'18), 2018, : 249 - 256
  • [28] Distributed Stochastic Gradient Descent With Compressed and Skipped Communication
    Phuong, Tran Thi
    Phong, Le Trieu
    Fukushima, Kazuhide
    IEEE ACCESS, 2023, 11 : 99836 - 99846
  • [29] Online Covariance Matrix Estimation in Stochastic Gradient Descent
    Zhu, Wanrong
    Chen, Xi
    Wu, Wei Biao
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2023, 118 (541) : 393 - 404
  • [30] STATISTICAL INFERENCE FOR MODEL PARAMETERS IN STOCHASTIC GRADIENT DESCENT
    Chen, Xi
    Lee, Jason D.
    Tong, Xin T.
    Zhang, Yichen
    ANNALS OF STATISTICS, 2020, 48 (01) : 251 - 273