REVISITING THE CENTRAL LIMIT THEOREMS FOR THE SGD-TYPE METHODS

Times cited: 0
Authors
Li, Tiejun [1, 2, 3]
Xiao, Tiannan [2, 4]
Yang, Guoguo [2, 4]
Affiliations
[1] Peking Univ, Lab Math & Its Applicat LMAM, Beijing 100871, Peoples R China
[2] Peking Univ, Sch Math Sci, Beijing 100871, Peoples R China
[3] Peking Univ, Ctr Machine Learning Res, Beijing 100871, Peoples R China
[4] Peking Univ, LMAM, Beijing 100871, Peoples R China
Keywords
Central limit theorem; SGD; momentum SGD; Nesterov acceleration; convergence
DOI
Not available
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
We revisit the central limit theorem (CLT) for stochastic gradient descent (SGD) type methods, including vanilla SGD, momentum SGD, and Nesterov accelerated SGD with constant or vanishing damping parameters. Using the Lyapunov function technique and L^p bound estimates, we establish the CLT under more general conditions on the learning rates, and for broader classes of SGD methods, than in previous results. We also investigate the CLT for the time average and find that it holds in the linear case but is not generally true in the nonlinear case. Numerical tests are carried out to verify the theoretical analysis.
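As a rough illustration of the kind of statement a CLT for SGD makes, the sketch below runs vanilla SGD with a constant learning rate on a one-dimensional quadratic objective (a linear-gradient case) and compares the variance of the rescaled iterate x_n / sqrt(eta) with its closed-form stationary value. It is not taken from the paper's numerical tests: the objective f(x) = a x^2 / 2, the additive Gaussian gradient noise, and the names `sgd_quadratic` and `stationary_variance` are all assumptions chosen for this example.

```python
import numpy as np


def sgd_quadratic(eta, a=1.0, sigma=1.0, n_steps=2000, n_chains=20000, seed=0):
    """Run independent SGD chains on f(x) = 0.5 * a * x**2 (hypothetical setup).

    The stochastic gradient is a * x + sigma * xi with xi ~ N(0, 1).
    Returns the final iterates rescaled by 1 / sqrt(eta).
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(n_chains)  # every chain starts at the minimizer x* = 0
    for _ in range(n_steps):
        noise = rng.standard_normal(n_chains)
        grad = a * x + sigma * noise  # noisy gradient estimate
        x = x - eta * grad            # vanilla SGD step, constant learning rate
    return x / np.sqrt(eta)


def stationary_variance(eta, a=1.0, sigma=1.0):
    """Stationary variance of x / sqrt(eta) for the linear recursion above.

    From x_{k+1} = (1 - eta * a) * x_k - eta * sigma * xi_k one obtains
    Var(x) = eta * sigma**2 / (a * (2 - eta * a)), valid for 0 < eta * a < 2.
    """
    return sigma**2 / (a * (2.0 - eta * a))


if __name__ == "__main__":
    for eta in (0.1, 0.01):
        z = sgd_quadratic(eta)
        print(f"eta={eta}: empirical var={z.var():.4f}, "
              f"closed form={stationary_variance(eta):.4f}")
```

In this toy setting the rescaled variance approaches sigma^2 / (2a) as eta shrinks, so the fluctuations around the minimizer are of order sqrt(eta). Vanishing learning rates, momentum, Nesterov acceleration, and nonlinear gradients, which are the subject of the abstract, are not covered by this sketch.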
Pages: 1427-1454
Page count: 28