On Almost Sure Convergence Rates of Stochastic Gradient Methods

Times Cited: 0
Authors
Liu, Jun [1 ]
Yuan, Ye [2 ,3 ]
Affiliations
[1] Univ Waterloo, Dept Appl Math, Waterloo, ON, Canada
[2] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan, Peoples R China
[3] Huazhong Univ Sci & Technol, Sch Mech Sci & Engn, Wuhan, Peoples R China
Source
CONFERENCE ON LEARNING THEORY, VOL 178 | 2022 / Vol. 178
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Stochastic gradient descent; stochastic heavy-ball; stochastic Nesterov's accelerated gradient; almost sure convergence rate; OPTIMIZATION; BOUNDS;
DOI
Not available
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
The vast majority of convergence rate analyses for stochastic gradient methods in the literature focus on convergence in expectation, whereas trajectory-wise almost sure convergence is clearly important to ensure that any particular instantiation of the stochastic algorithms converges with probability one. Here we provide a unified almost sure convergence rate analysis for the stochastic gradient descent (SGD), stochastic heavy-ball (SHB), and stochastic Nesterov's accelerated gradient (SNAG) methods. We show, for the first time, that the almost sure convergence rates obtained for these stochastic gradient methods on strongly convex functions are arbitrarily close to their optimal convergence rates. For non-convex objective functions, we show that not only a weighted average of the squared gradient norms but also the squared gradient norms of the last iterates converge to zero almost surely. We further provide last-iterate almost sure convergence rate analysis for stochastic gradient methods on general convex smooth functions, in contrast with most existing results in the literature, which only provide convergence in expectation for a weighted average of the iterates.
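For reference, the following is a minimal Python sketch (not from the paper) of the standard forms of the three update rules named in the abstract. The quadratic test function, noise model, and the diminishing step-size and momentum schedules are illustrative assumptions, not the schedules analyzed in the paper.

# Illustrative sketch of the three methods named in the abstract: SGD, SHB
# (heavy-ball momentum on the iterate), and SNAG (gradient at an extrapolated
# point). Schedules and test problem are assumptions, not the paper's.
import numpy as np

rng = np.random.default_rng(0)

def noisy_grad(x):
    # Stochastic gradient of f(x) = 0.5 * ||x||^2 with additive Gaussian noise.
    return x + 0.1 * rng.standard_normal(x.shape)

def run(method, steps=10_000, dim=5):
    x = np.ones(dim)
    x_prev = x.copy()
    for k in range(1, steps + 1):
        alpha = 1.0 / (k + 1)      # diminishing step size (illustrative choice)
        beta = k / (k + 3)         # momentum weight (illustrative choice)
        if method == "sgd":
            x_new = x - alpha * noisy_grad(x)
        elif method == "shb":      # heavy-ball: add momentum term to SGD step
            x_new = x - alpha * noisy_grad(x) + beta * (x - x_prev)
        elif method == "snag":     # Nesterov: evaluate gradient at extrapolated point
            y = x + beta * (x - x_prev)
            x_new = y - alpha * noisy_grad(y)
        else:
            raise ValueError(method)
        x_prev, x = x, x_new
    return np.linalg.norm(x)       # distance of the last iterate to the minimizer

for m in ("sgd", "shb", "snag"):
    print(m, run(m))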
Pages: 21