Convergence and Stability of the Stochastic Proximal Point Algorithm with Momentum

Cited: 0
Authors
Kim, Junhyung Lyle [1 ]
Toulis, Panos [2 ]
Kyrillidis, Anastasios [1 ]
Affiliations
[1] Rice Univ, Dept Comp Sci, Houston, TX 77251 USA
[2] Univ Chicago, Booth Sch Business, Chicago, IL 60637 USA
Source
LEARNING FOR DYNAMICS AND CONTROL CONFERENCE, 2022, Vol. 168
Keywords
empirical risk minimization; stochastic proximal point algorithm; momentum; stability; optimization; approximation; acceleration
DOI
Not available
Chinese Library Classification (CLC)
TP [Automation technology; computer technology]
Discipline code
0812
Abstract
Stochastic gradient descent with momentum (SGDM) is the dominant algorithm in many optimization scenarios, including convex optimization instances and non-convex neural network training. Yet, in the stochastic setting, momentum interferes with gradient noise, often requiring specific step size and momentum choices to guarantee convergence, let alone acceleration. Proximal point methods, on the other hand, have gained much attention due to their numerical stability and robustness against imperfect tuning. Their stochastic accelerated variants, though, have received limited attention: how momentum interacts with the stability of (stochastic) proximal point methods remains largely unstudied. To address this, we focus on the convergence and stability of the stochastic proximal point algorithm with momentum (SPPAM), and show that, under proper hyperparameter tuning, SPPAM achieves faster linear convergence to a neighborhood than the stochastic proximal point algorithm (SPPA), with a better contraction factor. In terms of stability, we show that SPPAM depends on problem constants more favorably than SGDM, allowing a wider range of step sizes and momentum values that lead to convergence.
Pages: 14
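
As a rough illustration of the kind of update the abstract describes, the sketch below applies a heavy-ball-style stochastic proximal point step to a least-squares problem, where each proximal subproblem has a closed form. The specific update (momentum extrapolation followed by a stochastic proximal step), the hyperparameter values, and the helper name sppam_least_squares are illustrative assumptions, not the paper's exact formulation or results.

import numpy as np

def sppam_least_squares(A, b, eta=0.5, beta=0.3, iters=2000, seed=0):
    # Illustrative sketch (assumed update form, not the paper's exact algorithm):
    #   y_t     = x_t + beta * (x_t - x_{t-1})                  (momentum extrapolation)
    #   x_{t+1} = argmin_x f_i(x) + ||x - y_t||^2 / (2 * eta)   (stochastic proximal step)
    # with f_i(x) = 0.5 * (a_i @ x - b_i)^2 for a uniformly sampled row i,
    # whose proximal step has the closed form used below.
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x_prev = np.zeros(d)
    x = np.zeros(d)
    for _ in range(iters):
        i = rng.integers(n)
        a_i, b_i = A[i], b[i]
        y = x + beta * (x - x_prev)
        # Closed-form proximal step for a single least-squares term:
        # x+ = y - eta * (a_i @ y - b_i) / (1 + eta * ||a_i||^2) * a_i
        resid = a_i @ y - b_i
        x_prev, x = x, y - eta * resid / (1.0 + eta * (a_i @ a_i)) * a_i
    return x

# Usage on synthetic data with a planted solution:
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 10))
x_star = rng.standard_normal(10)
b = A @ x_star
x_hat = sppam_least_squares(A, b)
print("distance to planted solution:", np.linalg.norm(x_hat - x_star))

Because the proximal step solves a small subproblem exactly rather than taking a raw gradient step, it remains well-behaved even for large eta, which is the stability property the abstract contrasts against SGDM.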