On Almost Sure Convergence Rates of Stochastic Gradient Methods

Times Cited: 0
Authors
Liu, Jun [1 ]
Yuan, Ye [2 ,3 ]
Affiliations
[1] Univ Waterloo, Dept Appl Math, Waterloo, ON, Canada
[2] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan, Peoples R China
[3] Huazhong Univ Sci & Technol, Sch Mech Sci & Engn, Wuhan, Peoples R China
Source
CONFERENCE ON LEARNING THEORY, VOL 178 | 2022 / Vol. 178
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Stochastic gradient descent; stochastic heavy-ball; stochastic Nesterov's accelerated gradient; almost sure convergence rate; OPTIMIZATION; BOUNDS;
DOI
Not available
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
The vast majority of convergence rate analyses for stochastic gradient methods in the literature focus on convergence in expectation, whereas trajectory-wise almost sure convergence is clearly important to ensure that any particular instantiation of the stochastic algorithms converges with probability one. Here we provide a unified almost sure convergence rate analysis for the stochastic gradient descent (SGD), stochastic heavy-ball (SHB), and stochastic Nesterov's accelerated gradient (SNAG) methods. We show, for the first time, that the almost sure convergence rates obtained for these stochastic gradient methods on strongly convex functions are arbitrarily close to their optimal convergence rates. For non-convex objective functions, we show that not only a weighted average of the squared gradient norms but also the squared gradient norms of the last iterates converge to zero almost surely. We further provide last-iterate almost sure convergence rate analysis for stochastic gradient methods on general convex smooth functions, in contrast with most existing results in the literature, which only provide convergence in expectation for a weighted average of the iterates.
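For reference, the following is a minimal Python sketch (not from the paper) of the standard forms of the three update rules named in the abstract. The quadratic test function, noise model, and the diminishing step-size and momentum schedules are illustrative assumptions, not the schedules analyzed in the paper.

# Illustrative sketch of the three methods named in the abstract: SGD, SHB
# (heavy-ball momentum on the iterate), and SNAG (gradient at an extrapolated
# point). Schedules and test problem are assumptions, not the paper's.
import numpy as np

rng = np.random.default_rng(0)

def noisy_grad(x):
    # Stochastic gradient of f(x) = 0.5 * ||x||^2 with additive Gaussian noise.
    return x + 0.1 * rng.standard_normal(x.shape)

def run(method, steps=10_000, dim=5):
    x = np.ones(dim)
    x_prev = x.copy()
    for k in range(1, steps + 1):
        alpha = 1.0 / (k + 1)      # diminishing step size (illustrative choice)
        beta = k / (k + 3)         # momentum weight (illustrative choice)
        if method == "sgd":
            x_new = x - alpha * noisy_grad(x)
        elif method == "shb":      # heavy-ball: add momentum term to SGD step
            x_new = x - alpha * noisy_grad(x) + beta * (x - x_prev)
        elif method == "snag":     # Nesterov: evaluate gradient at extrapolated point
            y = x + beta * (x - x_prev)
            x_new = y - alpha * noisy_grad(y)
        else:
            raise ValueError(method)
        x_prev, x = x, x_new
    return np.linalg.norm(x)       # distance of the last iterate to the minimizer

for m in ("sgd", "shb", "snag"):
    print(m, run(m))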
Pages: 21