On Almost Sure Convergence Rates of Stochastic Gradient Methods

Cited: 0
Authors
Liu, Jun [1]
Yuan, Ye [2,3]
Affiliations
[1] Univ Waterloo, Dept Appl Math, Waterloo, ON, Canada
[2] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan, Peoples R China
[3] Huazhong Univ Sci & Technol, Sch Mech Sci & Engn, Wuhan, Peoples R China
Source
CONFERENCE ON LEARNING THEORY, VOL 178 | 2022 / Vol. 178
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Stochastic gradient descent; stochastic heavy-ball; stochastic Nesterov's accelerated gradient; almost sure convergence rate; OPTIMIZATION; BOUNDS;
DOI
None available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
The vast majority of convergence rate analyses for stochastic gradient methods in the literature focus on convergence in expectation, whereas trajectory-wise almost sure convergence is clearly important to ensure that any instantiation of the stochastic algorithms converges with probability one. Here we provide a unified almost sure convergence rate analysis for the stochastic gradient descent (SGD), stochastic heavy-ball (SHB), and stochastic Nesterov's accelerated gradient (SNAG) methods. We show, for the first time, that the almost sure convergence rates obtained for these stochastic gradient methods on strongly convex functions are arbitrarily close to the optimal rates possible. For non-convex objective functions, we show not only that a weighted average of the squared gradient norms converges to zero almost surely, but also that this holds for the last iterates of the algorithms. We further provide a last-iterate almost sure convergence rate analysis for stochastic gradient methods on general convex smooth functions, in contrast with most existing results in the literature, which only establish convergence in expectation for a weighted average of the iterates.
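To make the three methods named in the abstract concrete, below is a minimal sketch of the SGD, SHB, and SNAG updates in their standard textbook forms, applied to a toy strongly convex quadratic with additive gradient noise. The objective, noise model, constant momentum beta = 0.9, and step-size schedule a_k = 1/(k+2) are illustrative assumptions, not the settings analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_grad(x, noise=0.1):
    """Stochastic gradient of the toy objective f(x) = 0.5 * ||x||^2,
    corrupted by zero-mean Gaussian noise (unbiased, bounded variance)."""
    return x + noise * rng.standard_normal(x.shape)

def sgd(x0, steps=2000):
    """SGD: x_{k+1} = x_k - a_k * g(x_k, xi_k)."""
    x = x0.copy()
    for k in range(steps):
        a = 1.0 / (k + 2)  # diminishing step size (illustrative choice)
        x = x - a * noisy_grad(x)
    return x

def shb(x0, steps=2000, beta=0.9):
    """Stochastic heavy-ball:
    x_{k+1} = x_k - a_k * g(x_k, xi_k) + beta * (x_k - x_{k-1})."""
    x_prev = x0.copy()
    x = x0.copy()
    for k in range(steps):
        a = 1.0 / (k + 2)
        x, x_prev = x - a * noisy_grad(x) + beta * (x - x_prev), x
    return x

def snag(x0, steps=2000, beta=0.9):
    """Stochastic Nesterov: the gradient is evaluated at the extrapolated
    point y_k = x_k + beta * (x_k - x_{k-1}), then
    x_{k+1} = y_k - a_k * g(y_k, xi_k)."""
    x_prev = x0.copy()
    x = x0.copy()
    for k in range(steps):
        a = 1.0 / (k + 2)
        y = x + beta * (x - x_prev)  # look-ahead (extrapolation) step
        x, x_prev = y - a * noisy_grad(y), x
    return x

# The minimizer of f is x* = 0, so ||x_K|| measures the optimization error
# along a single sampled trajectory (one "instantiation" of the algorithm).
x0 = np.ones(5)
for name, method in [("SGD", sgd), ("SHB", shb), ("SNAG", snag)]:
    print(f"{name}: ||x_K - x*|| = {np.linalg.norm(method(x0)):.4f}")
```

Each run of the script prints the terminal distance to the minimizer for one random trajectory; the almost sure results discussed in the abstract concern the rates at which such trajectory-wise errors vanish with probability one, rather than only in expectation.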
Pages: 21
Related Papers
50 records in total
  • [1] Almost Sure Convergence Rates Analysis and Saddle Avoidance of Stochastic Gradient Methods
    Liu, Jun
    Yuan, Ye
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25 : 1 - 40
  • [2] ON THE CONVERGENCE OF MOMENTS IN THE ALMOST SURE CENTRAL LIMIT THEOREM FOR STOCHASTIC APPROXIMATION ALGORITHMS
    Cenac, Peggy
    ESAIM-PROBABILITY AND STATISTICS, 2013, 17 : 179 - 194
  • [3] On the Convergence Rates of Policy Gradient Methods
    Xiao, Lin
    JOURNAL OF MACHINE LEARNING RESEARCH, 2022, 23
  • [4] On the Almost Sure Convergence Rates for Pairwise Negative Quadrant Dependent Random Variables
    Xing, G.-D.
    THAI JOURNAL OF MATHEMATICS, 2010, 8 (01): 171 - 184
  • [5] Convergence Rates of Distributed Gradient Methods Under Random Quantization: A Stochastic Approximation Approach
    Doan, Thinh T.
    Maguluri, Siva Theja
    Romberg, Justin
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2021, 66 (10) : 4469 - 4484
  • [6] Almost sure convergence rates for system identification using binary, quantized, and regular sensors
    Mei, Hongwei
    Wang, Le Yi
    Yin, George
    AUTOMATICA, 2014, 50 (08) : 2120 - 2127
  • [7] On the rates of convergence of parallelized averaged stochastic gradient algorithms
    Godichon-Baggioni, Antoine
    Saadane, Sofiane
    STATISTICS, 2020, 54 (03) : 618 - 635
  • [8] Almost Sure Convergence and Non-Asymptotic Concentration Bounds for Stochastic Mirror Descent Algorithm
    Paul, Anik Kumar
    Mahindrakar, Arun D.
    Kalaimani, Rachel K.
    IEEE CONTROL SYSTEMS LETTERS, 2024, 8 : 2397 - 2402
  • [9] Modified stochastic gradient identification algorithms with fast convergence rates
    Chen, Jing
    Ding, Feng
    JOURNAL OF VIBRATION AND CONTROL, 2011, 17 (09) : 1281 - 1286
  • [10] Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates
    Vaswani, Sharan
    Mishkin, Aaron
    Laradji, Issam
    Schmidt, Mark
    Gidel, Gauthier
    Lacoste-Julien, Simon
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NEURIPS 2019), 2019, 32