CONVERGENCE AND DYNAMICAL BEHAVIOR OF THE ADAM ALGORITHM FOR NONCONVEX STOCHASTIC OPTIMIZATION

Cited: 52
Authors
Barakat, Anas [1 ]
Bianchi, Pascal [1 ]
Affiliations
[1] Inst Polytech Paris, Telecom Paris, LTCI, F-91120 Palaiseau, France
Keywords
stochastic approximation; dynamical systems; adaptive gradient methods
DOI
10.1137/19M1263443
CLC Classification
O29 [Applied Mathematics]
Subject Classification
070104
Abstract
Adam is a popular variant of stochastic gradient descent for finding a local minimizer of a function. In the constant-stepsize regime, assuming that the objective function is differentiable and nonconvex, we establish the long-run convergence of the iterates to a stationary point under a stability condition. The key ingredient is the introduction of a continuous-time version of Adam, in the form of a nonautonomous ordinary differential equation. This continuous-time system is a relevant approximation of the Adam iterates, in the sense that the interpolated Adam process converges weakly to the solution of the ODE. The existence and uniqueness of the solution are established. We further show the convergence of the solution to the critical points of the objective function and quantify its convergence rate under a Łojasiewicz assumption. We then introduce a novel decreasing-stepsize version of Adam. Under mild assumptions, the iterates are shown to be almost surely bounded and to converge almost surely to critical points of the objective function. Finally, we analyze the fluctuations of the algorithm by means of a conditional central limit theorem.
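For reference, the Adam recursion under study is the standard one of Kingma and Ba. The sketch below is a minimal NumPy implementation of one constant-stepsize Adam step, not the authors' code; grad_fn and all hyperparameter values are illustrative assumptions. Note that the bias-correction factors depend explicitly on the step index n, which is consistent with the continuous-time counterpart being a nonautonomous ODE; the decreasing-stepsize variant analyzed in the paper would replace the constant alpha by a vanishing schedule.

import numpy as np

def adam_step(x, m, v, n, grad_fn, alpha=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    # One standard Adam iteration with constant stepsize alpha.
    # x: iterate; m, v: first/second moment estimates; n: step index (n >= 1).
    # grad_fn(x) returns a (possibly stochastic) gradient estimate; it is an
    # illustrative placeholder, not something defined in the paper.
    g = grad_fn(x)
    m = b1 * m + (1 - b1) * g           # exponential moving average of gradients
    v = b2 * v + (1 - b2) * g ** 2      # exponential moving average of squared gradients
    m_hat = m / (1 - b1 ** n)           # bias correction (depends on n)
    v_hat = v / (1 - b2 ** n)           # bias correction (depends on n)
    x = x - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return x, m, v

# Toy usage on f(x) = ||x||^2 / 2, whose exact gradient is x (no noise here):
x, m, v = np.ones(3), np.zeros(3), np.zeros(3)
for n in range(1, 2001):
    x, m, v = adam_step(x, m, v, n, grad_fn=lambda z: z, alpha=1e-2)
print(x)  # approaches the unique critical point at 0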
Pages: 244-274
Number of pages: 31
Related Papers
50 records in total
  • [21] SignProx: One-Bit Proximal Algorithm for Nonconvex Stochastic Optimization
    Xu, Xiaojian
    Kamilov, Ulugbek S.
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019: 7800-7804
  • [22] Online Distributed Stochastic Gradient Algorithm for Nonconvex Optimization With Compressed Communication
    Li, Jueyou
    Li, Chaojie
    Fan, Jing
    Huang, Tingwen
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2024, 69 (02): 936-951
  • [23] Proximal stochastic recursive momentum algorithm for nonsmooth nonconvex optimization problems
    Wang, Zhaoxin
    Wen, Bo
    OPTIMIZATION, 2024, 73 (02): 481-495
  • [24] Distributed Adaptive Gradient Algorithm With Gradient Tracking for Stochastic Nonconvex Optimization
    Han, Dongyu
    Liu, Kun
    Lin, Yeming
    Xia, Yuanqing
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2024, 69 (09): 6333-6340
  • [25] Stochastic Anderson Mixing for Nonconvex Stochastic Optimization
    Wei, Fuchao
    Bao, Chenglong
    Liu, Yang
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021
  • [26] Global convergence of a BFGS-type algorithm for nonconvex multiobjective optimization problems
    Prudente, L. F.
    Souza, D. R.
    COMPUTATIONAL OPTIMIZATION AND APPLICATIONS, 2024, 88 (03): 719-757
  • [27] A quasi-Newton algorithm for nonconvex, nonsmooth optimization with global convergence guarantees
    Curtis, F. E.
    Que, X.
    MATHEMATICAL PROGRAMMING COMPUTATION, 2015, 7 (04): 399-428
  • [28] Convergence behavior of diffusion stochastic gradient descent algorithm
    Barani, Fatemeh
    Savadi, Abdorreza
    Yazdi, Hadi Sadoghi
    SIGNAL PROCESSING, 2021, 183
  • [29] An Efficient Stochastic Algorithm for Decentralized Nonconvex-Strongly-Concave Minimax Optimization
    Chen, Lesi
    Ye, Haishan
    Luo, Luo
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024
  • [30] GNSD: A Gradient-Tracking Based Nonconvex Stochastic Algorithm for Decentralized Optimization
    Lu, Songtao
    Zhang, Xinwei
    Sun, Haoran
    Hong, Mingyi
    2019 IEEE DATA SCIENCE WORKSHOP (DSW), 2019: 315-321