Optimal non-asymptotic analysis of the Ruppert-Polyak averaging stochastic algorithm

Cited by: 7
Authors
Gadat, Sebastien [1 ]
Panloup, Fabien [2 ]
Affiliations
[1] Univ Toulouse 1 Capitole, Inst Univ France, Toulouse Sch Econ, CNRS UMR 5314, Esplanade Univ, Toulouse, France
[2] Univ Angers, CNRS, LAREMA, SFR MATHSTIC, F-49000 Angers, France
Keywords
Optimization; Averaging; Stochastic Gradient Descent; HILBERT-SPACES; APPROXIMATION; COMPLEXITY; DESCENT; BOUNDS;
DOI
10.1016/j.spa.2022.11.012
Chinese Library Classification (CLC)
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification codes
020208 ; 070103 ; 0714 ;
Abstract
This paper is devoted to the non-asymptotic analysis of the Ruppert-Polyak averaging method, introduced in Polyak and Juditsky (1992) and Ruppert (1988), for the minimization of a smooth function f with a stochastic algorithm. We first establish a general non-asymptotic optimal bound: if $\hat{\theta}_n$ is the position of the algorithm at step n, we prove that
$$\mathbb{E}\,\big|\hat{\theta}_n - \arg\min f\big|^2 \;\le\; \frac{\operatorname{Tr}(\Sigma^{\star})}{n} + C_{d,f}\, n^{-r_\beta},$$
where $\Sigma^{\star}$ is the limiting covariance matrix of the CLT demonstrated in Polyak and Juditsky (1992) and $C_{d,f}\, n^{-r_\beta}$ is a new state-of-the-art second-order term that translates the effect of the dimension. We also identify the optimal gain of the baseline SGD, $\gamma_n = \gamma\, n^{-3/4}$, leading to a second-order term with $r_{3/4} = 5/4$. Second, we show that this result holds under a Kurdyka-Łojasiewicz-type condition (Kurdyka, 1998; Łojasiewicz, 1963) on the function f, which is far more general than the standard uniformly strongly convex case. In particular, it makes it possible to handle pathological examples such as online learning for logistic regression and recursive quantile estimation. (C) 2022 Elsevier B.V. All rights reserved.
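The procedure summarized in the abstract can be sketched numerically. The following is a minimal illustration (not the paper's code): plain SGD with the step size $\gamma_n = \gamma\, n^{-3/4}$ identified as optimal above, combined with a running Ruppert-Polyak average of the iterates, on a toy strongly convex quadratic with noisy gradients. The function, noise level, and all parameter values are assumptions chosen for the demonstration.

```python
import numpy as np

def averaged_sgd(theta_star, n_steps=20000, gamma=1.0, noise=0.5, seed=0):
    """SGD on f(theta) = 0.5*||theta - theta_star||^2 with Gaussian gradient
    noise, returning both the last iterate and its Ruppert-Polyak average."""
    rng = np.random.default_rng(seed)
    theta = np.zeros_like(theta_star)      # SGD iterate theta_n
    theta_bar = np.zeros_like(theta_star)  # running average (hat theta_n)
    for n in range(1, n_steps + 1):
        # Noisy gradient of the quadratic at the current iterate.
        grad = (theta - theta_star) + noise * rng.standard_normal(theta.shape)
        # Step size gamma_n = gamma * n^(-3/4), as in the abstract.
        theta = theta - gamma * n ** (-0.75) * grad
        # Online update of the average of theta_1, ..., theta_n.
        theta_bar += (theta - theta_bar) / n
    return theta, theta_bar

theta_star = np.array([1.0, -2.0, 3.0])
last, avg = averaged_sgd(theta_star)
# The averaged iterate is typically much closer to arg min f than the last one.
print(np.linalg.norm(last - theta_star), np.linalg.norm(avg - theta_star))
```

The averaged iterate attains the $\operatorname{Tr}(\Sigma^{\star})/n$ leading-order error even though the underlying SGD uses a slowly decaying step, which is the point of the averaging method.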
Pages: 312-348
Number of pages: 37
Related papers
16 records in total
  • [1] Non-Asymptotic Analysis of Stochastic Approximation Algorithms for Streaming Data
    Godichon-Baggioni, Antoine
    Werge, Nicklas
    Wintenberger, Olivier
    ESAIM-PROBABILITY AND STATISTICS, 2023, 27 : 482 - 514
  • [2] Non-asymptotic confidence bounds for the optimal value of a stochastic program
    Guigues, Vincent
    Juditsky, Anatoli
    Nemirovski, Arkadi
    OPTIMIZATION METHODS & SOFTWARE, 2017, 32 (05) : 1033 - 1058
  • [3] Non-asymptotic Analysis of Biased Stochastic Approximation Scheme
    Karimi, Belhal
    Miasojedow, Blazej
    Moulines, Eric
    Wai, Hoi-To
    CONFERENCE ON LEARNING THEORY, VOL 99, 2019, 99
  • [4] Almost Sure Convergence and Non-Asymptotic Concentration Bounds for Stochastic Mirror Descent Algorithm
    Paul, Anik Kumar
    Mahindrakar, Arun D.
    Kalaimani, Rachel K.
    IEEE CONTROL SYSTEMS LETTERS, 2024, 8 : 2397 - 2402
  • [5] Non-asymptotic Analysis of Stochastic Methods for Non-Smooth Non-Convex Regularized Problems
    Xu, Yi
    Jin, Rong
    Yang, Tianbao
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [6] Normal Approximation for Stochastic Gradient Descent via Non-Asymptotic Rates of Martingale CLT
    Anastasiou, Andreas
    Balasubramanian, Krishnakumar
    Erdogdu, Murat A.
    CONFERENCE ON LEARNING THEORY, VOL 99, 2019, 99
  • [7] Asymptotic properties of dual averaging algorithm for constrained distributed stochastic optimization
    Zhao, Shengchao
    Chen, Xing-Min
    Liu, Yongchao
    SYSTEMS & CONTROL LETTERS, 2022, 165
  • [8] Non-asymptotic distributional bounds for the Dickman approximation of the running time of the Quickselect algorithm
    Goldstein, Larry
    ELECTRONIC JOURNAL OF PROBABILITY, 2018, 23
  • [9] Stochastic Optimization for DC Functions and Non-smooth Non-convex Regularizers with Non-asymptotic Convergence
    Xu, Yi
    Qi, Qi
    Lin, Qihang
    Jin, Rong
    Yang, Tianbao
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [10] Non-asymptotic analysis of ensemble Kalman updates: effective dimension and localization
    Al-Ghattas, Omar
    Sanz-Alonso, Daniel
    INFORMATION AND INFERENCE-A JOURNAL OF THE IMA, 2024, 13 (01)