On the Last Iterate Convergence of Momentum Methods

Cited: 0
Authors
Li, Xiaoyu [1 ]
Liu, Mingrui [2 ]
Orabona, Francesco [3 ]
Affiliations
[1] Boston Univ, Div Syst Engn, Boston, MA 02215 USA
[2] George Mason Univ, Dept Comp Sci, Fairfax, VA 22030 USA
[3] Boston Univ, Elect & Comp Engn, Boston, MA 02215 USA
Source
INTERNATIONAL CONFERENCE ON ALGORITHMIC LEARNING THEORY, VOL 167, 2022
Funding
National Science Foundation (USA)
Keywords
Convex Optimization; Momentum Methods; Stochastic Optimization
DOI
None
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
SGD with Momentum (SGDM) is a widely used family of algorithms for large-scale optimization of machine learning problems. Yet, when optimizing generic convex functions, no advantage is known for any SGDM algorithm over plain SGD. Moreover, even the most recent results require changes to the SGDM algorithms, such as averaging of the iterates and a projection onto a bounded domain, which are rarely used in practice. In this paper, we focus on the convergence rate of the last iterate of SGDM. For the first time, we prove that for any constant momentum factor, there exists a Lipschitz and convex function for which the last iterate of SGDM suffers from a suboptimal convergence rate of Ω(ln T / √T) after T iterations. Based on this fact, we study a class of (both adaptive and non-adaptive) Follow-The-Regularized-Leader-based SGDM algorithms with increasing momentum and shrinking updates. For these algorithms, we show that the last iterate converges at the optimal rate O(1/√T) for unconstrained convex stochastic optimization problems, without projections onto bounded domains or knowledge of T. Further, we show a variety of results for FTRL-based SGDM when used with adaptive stepsizes. Empirical results are reported as well.
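To make the abstract's construction concrete, the following is a minimal Python sketch of one way an FTRL-based SGDM with increasing momentum and shrinking updates can be instantiated; it is an illustration, not the authors' exact algorithm. It maintains FTRL iterates z_t = x_1 - η_t Σ_{i≤t} g_i with shrinking stepsizes η_t = η₀/√t and outputs the running average x_t of the z's; unrolled, x_{t+1} = (t/(t+1)) x_t + (1/(t+1)) z_{t+1}, a momentum-style recursion whose momentum factor t/(t+1) increases toward 1. The function name ftrl_sgdm, the arguments grad and eta0, and the specific schedules are assumptions made for this sketch.

    import numpy as np

    def ftrl_sgdm(grad, x1, eta0=0.1, T=1000, rng=None):
        # Sketch of an FTRL-based SGDM: the returned LAST iterate x_T is a
        # running average of FTRL iterates, i.e. a momentum-style update with
        # increasing momentum t/(t+1) and shrinking per-step changes.
        rng = np.random.default_rng() if rng is None else rng
        x = x1.copy()              # momentum iterate (the last iterate we track)
        g_sum = np.zeros_like(x1)  # sum of stochastic gradients seen so far
        for t in range(1, T + 1):
            g = grad(x, rng)       # stochastic (sub)gradient queried at x_t
            g_sum += g
            # FTRL iterate with shrinking stepsize eta_t = eta0 / sqrt(t)
            # (illustrative schedule):
            z = x1 - (eta0 / np.sqrt(t)) * g_sum
            # Running average <=> momentum with increasing factor t/(t+1):
            # x_{t+1} = (t/(t+1)) * x_t + (1/(t+1)) * z_{t+1}
            x = x + (z - x) / (t + 1)
        return x

    # Usage on a toy Lipschitz convex problem, f(x) = |x|, with noisy subgradients:
    noisy_subgrad = lambda x, rng: np.sign(x) + 0.1 * rng.standard_normal(x.shape)
    x_last = ftrl_sgdm(noisy_subgrad, x1=np.array([5.0]), T=5000)

The point of such a construction, as the abstract describes it, is that no projection onto a bounded domain and no knowledge of T is needed: the averaging is folded into the update itself, so the plain last iterate can inherit the O(1/√T) guarantee that normally holds only for the averaged iterate.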
Pages: 19