On Almost Sure Convergence Rates of Stochastic Gradient Methods

Cited: 0
Authors
Liu, Jun [1]
Yuan, Ye [2,3]
Affiliations
[1] Univ Waterloo, Dept Appl Math, Waterloo, ON, Canada
[2] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan, Peoples R China
[3] Huazhong Univ Sci & Technol, Sch Mech Sci & Engn, Wuhan, Peoples R China
Source
CONFERENCE ON LEARNING THEORY, VOL 178 | 2022 / Vol. 178
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Stochastic gradient descent; stochastic heavy-ball; stochastic Nesterov's accelerated gradient; almost sure convergence rate; OPTIMIZATION; BOUNDS;
DOI
None available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
The vast majority of convergence rate analyses for stochastic gradient methods in the literature focus on convergence in expectation, whereas trajectory-wise almost sure convergence is clearly important to ensure that any instantiation of the stochastic algorithms converges with probability one. Here we provide a unified almost sure convergence rate analysis for the stochastic gradient descent (SGD), stochastic heavy-ball (SHB), and stochastic Nesterov's accelerated gradient (SNAG) methods. We show, for the first time, that the almost sure convergence rates obtained for these stochastic gradient methods on strongly convex functions are arbitrarily close to the optimal rates possible. For non-convex objective functions, we show not only that a weighted average of the squared gradient norms converges to zero almost surely, but also that this holds for the last iterates of the algorithms. We further provide a last-iterate almost sure convergence rate analysis for stochastic gradient methods on general convex smooth functions, in contrast with most existing results in the literature, which only establish convergence in expectation for a weighted average of the iterates.
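To make the three methods named in the abstract concrete, below is a minimal sketch of the SGD, SHB, and SNAG updates in their standard textbook forms, applied to a toy strongly convex quadratic with additive gradient noise. The objective, noise model, constant momentum beta = 0.9, and step-size schedule a_k = 1/(k+2) are illustrative assumptions, not the settings analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_grad(x, noise=0.1):
    """Stochastic gradient of the toy objective f(x) = 0.5 * ||x||^2,
    corrupted by zero-mean Gaussian noise (unbiased, bounded variance)."""
    return x + noise * rng.standard_normal(x.shape)

def sgd(x0, steps=2000):
    """SGD: x_{k+1} = x_k - a_k * g(x_k, xi_k)."""
    x = x0.copy()
    for k in range(steps):
        a = 1.0 / (k + 2)  # diminishing step size (illustrative choice)
        x = x - a * noisy_grad(x)
    return x

def shb(x0, steps=2000, beta=0.9):
    """Stochastic heavy-ball:
    x_{k+1} = x_k - a_k * g(x_k, xi_k) + beta * (x_k - x_{k-1})."""
    x_prev = x0.copy()
    x = x0.copy()
    for k in range(steps):
        a = 1.0 / (k + 2)
        x, x_prev = x - a * noisy_grad(x) + beta * (x - x_prev), x
    return x

def snag(x0, steps=2000, beta=0.9):
    """Stochastic Nesterov: the gradient is evaluated at the extrapolated
    point y_k = x_k + beta * (x_k - x_{k-1}), then
    x_{k+1} = y_k - a_k * g(y_k, xi_k)."""
    x_prev = x0.copy()
    x = x0.copy()
    for k in range(steps):
        a = 1.0 / (k + 2)
        y = x + beta * (x - x_prev)  # look-ahead (extrapolation) step
        x, x_prev = y - a * noisy_grad(y), x
    return x

# The minimizer of f is x* = 0, so ||x_K|| measures the optimization error
# along a single sampled trajectory (one "instantiation" of the algorithm).
x0 = np.ones(5)
for name, method in [("SGD", sgd), ("SHB", shb), ("SNAG", snag)]:
    print(f"{name}: ||x_K - x*|| = {np.linalg.norm(method(x0)):.4f}")
```

Each run of the script prints the terminal distance to the minimizer for one random trajectory; the almost sure results discussed in the abstract concern the rates at which such trajectory-wise errors vanish with probability one, rather than only in expectation.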
Pages: 21
Related Papers
50 records in total
  • [1] Almost Sure Convergence Rates Analysis and Saddle Avoidance of Stochastic Gradient Methods
    Liu, Jun
    Yuan, Ye
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25 : 1 - 40
  • [2] ON THE CONVERGENCE OF MOMENTS IN THE ALMOST SURE CENTRAL LIMIT THEOREM FOR STOCHASTIC APPROXIMATION ALGORITHMS
    Cenac, Peggy
    ESAIM-PROBABILITY AND STATISTICS, 2013, 17 : 179 - 194
  • [3] On the Convergence Rates of Policy Gradient Methods
    Xiao, Lin
    JOURNAL OF MACHINE LEARNING RESEARCH, 2022, 23
  • [4] On the Almost Sure Convergence Rates for Pairwise Negative Quadrant Dependent Random Variables
    Xing, G.-D.
    THAI JOURNAL OF MATHEMATICS, 2010, 8 (01): 171 - 184
  • [5] Convergence Rates of Distributed Gradient Methods Under Random Quantization: A Stochastic Approximation Approach
    Doan, Thinh T.
    Maguluri, Siva Theja
    Romberg, Justin
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2021, 66 (10) : 4469 - 4484
  • [6] Almost sure convergence rates for system identification using binary, quantized, and regular sensors
    Mei, Hongwei
    Wang, Le Yi
    Yin, George
    AUTOMATICA, 2014, 50 (08) : 2120 - 2127
  • [7] On the rates of convergence of parallelized averaged stochastic gradient algorithms
    Godichon-Baggioni, Antoine
    Saadane, Sofiane
    STATISTICS, 2020, 54 (03) : 618 - 635
  • [8] Almost Sure Convergence and Non-Asymptotic Concentration Bounds for Stochastic Mirror Descent Algorithm
    Paul, Anik Kumar
    Mahindrakar, Arun D.
    Kalaimani, Rachel K.
    IEEE CONTROL SYSTEMS LETTERS, 2024, 8 : 2397 - 2402
  • [9] Modified stochastic gradient identification algorithms with fast convergence rates
    Chen, Jing
    Ding, Feng
    JOURNAL OF VIBRATION AND CONTROL, 2011, 17 (09) : 1281 - 1286
  • [10] Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates
    Vaswani, Sharan
    Mishkin, Aaron
    Laradji, Issam
    Schmidt, Mark
    Gidel, Gauthier
    Lacoste-Julien, Simon
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NEURIPS 2019), 2019, 32