Optimal non-asymptotic analysis of the Ruppert-Polyak averaging stochastic algorithm

Cited by: 7
Authors
Gadat, Sebastien [1 ]
Panloup, Fabien [2 ]
Affiliations
[1] Univ Toulouse 1 Capitole, Inst Univ France, Toulouse Sch Econ, CNRS UMR 5314, Esplanade Univ, Toulouse, France
[2] Univ Angers, CNRS, LAREMA, SFR MATHSTIC, F-49000 Angers, France
Keywords
Optimization; Averaging; Stochastic Gradient Descent; HILBERT-SPACES; APPROXIMATION; COMPLEXITY; DESCENT; BOUNDS;
DOI
10.1016/j.spa.2022.11.012
Chinese Library Classification (CLC)
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification codes
020208 ; 070103 ; 0714 ;
Abstract
This paper is devoted to the non-asymptotic analysis of the Ruppert-Polyak averaging method, introduced in Polyak and Juditsky (1992) and Ruppert (1988), for the minimization of a smooth function f with a stochastic algorithm. We first establish a general non-asymptotic optimal bound: if $\hat{\theta}_n$ is the position of the algorithm at step n, we prove that
$$\mathbb{E}\,\big|\hat{\theta}_n - \arg\min f\big|^2 \;\le\; \frac{\operatorname{Tr}(\Sigma^{\star})}{n} + C_{d,f}\, n^{-r_\beta},$$
where $\Sigma^{\star}$ is the limiting covariance matrix of the CLT demonstrated in Polyak and Juditsky (1992) and $C_{d,f}\, n^{-r_\beta}$ is a new state-of-the-art second-order term that translates the effect of the dimension. We also identify the optimal gain of the baseline SGD, $\gamma_n = \gamma\, n^{-3/4}$, leading to a second-order term with $r_{3/4} = 5/4$. Second, we show that this result holds under a Kurdyka-Łojasiewicz-type condition (Kurdyka, 1998; Łojasiewicz, 1963) on the function f, which is far more general than the standard uniformly strongly convex case. In particular, it makes it possible to handle pathological examples such as online learning for logistic regression and recursive quantile estimation. (C) 2022 Elsevier B.V. All rights reserved.
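The procedure summarized in the abstract can be sketched numerically. The following is a minimal illustration (not the paper's code): plain SGD with the step size $\gamma_n = \gamma\, n^{-3/4}$ identified as optimal above, combined with a running Ruppert-Polyak average of the iterates, on a toy strongly convex quadratic with noisy gradients. The function, noise level, and all parameter values are assumptions chosen for the demonstration.

```python
import numpy as np

def averaged_sgd(theta_star, n_steps=20000, gamma=1.0, noise=0.5, seed=0):
    """SGD on f(theta) = 0.5*||theta - theta_star||^2 with Gaussian gradient
    noise, returning both the last iterate and its Ruppert-Polyak average."""
    rng = np.random.default_rng(seed)
    theta = np.zeros_like(theta_star)      # SGD iterate theta_n
    theta_bar = np.zeros_like(theta_star)  # running average (hat theta_n)
    for n in range(1, n_steps + 1):
        # Noisy gradient of the quadratic at the current iterate.
        grad = (theta - theta_star) + noise * rng.standard_normal(theta.shape)
        # Step size gamma_n = gamma * n^(-3/4), as in the abstract.
        theta = theta - gamma * n ** (-0.75) * grad
        # Online update of the average of theta_1, ..., theta_n.
        theta_bar += (theta - theta_bar) / n
    return theta, theta_bar

theta_star = np.array([1.0, -2.0, 3.0])
last, avg = averaged_sgd(theta_star)
# The averaged iterate is typically much closer to arg min f than the last one.
print(np.linalg.norm(last - theta_star), np.linalg.norm(avg - theta_star))
```

The averaged iterate attains the $\operatorname{Tr}(\Sigma^{\star})/n$ leading-order error even though the underlying SGD uses a slowly decaying step, which is the point of the averaging method.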
Pages: 312-348
Number of pages: 37
Related papers
16 records in total
  • [1] Non-Asymptotic Analysis of Stochastic Approximation Algorithms for Streaming Data
    Godichon-Baggioni, Antoine
    Werge, Nicklas
    Wintenberger, Olivier
    ESAIM-PROBABILITY AND STATISTICS, 2023, 27 : 482 - 514
  • [2] Non-asymptotic confidence bounds for the optimal value of a stochastic program
    Guigues, Vincent
    Juditsky, Anatoli
    Nemirovski, Arkadi
    OPTIMIZATION METHODS & SOFTWARE, 2017, 32 (05) : 1033 - 1058
  • [3] Non-asymptotic Analysis of Biased Stochastic Approximation Scheme
    Karimi, Belhal
    Miasojedow, Blazej
    Moulines, Eric
    Wai, Hoi-To
    CONFERENCE ON LEARNING THEORY, VOL 99, 2019, 99
  • [4] Almost Sure Convergence and Non-Asymptotic Concentration Bounds for Stochastic Mirror Descent Algorithm
    Paul, Anik Kumar
    Mahindrakar, Arun D.
    Kalaimani, Rachel K.
    IEEE CONTROL SYSTEMS LETTERS, 2024, 8 : 2397 - 2402
  • [5] Non-asymptotic Analysis of Stochastic Methods for Non-Smooth Non-Convex Regularized Problems
    Xu, Yi
    Jin, Rong
    Yang, Tianbao
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [6] Normal Approximation for Stochastic Gradient Descent via Non-Asymptotic Rates of Martingale CLT
    Anastasiou, Andreas
    Balasubramanian, Krishnakumar
    Erdogdu, Murat A.
    CONFERENCE ON LEARNING THEORY, VOL 99, 2019, 99
  • [7] Asymptotic properties of dual averaging algorithm for constrained distributed stochastic optimization
    Zhao, Shengchao
    Chen, Xing-Min
    Liu, Yongchao
    SYSTEMS & CONTROL LETTERS, 2022, 165
  • [8] Non-asymptotic distributional bounds for the Dickman approximation of the running time of the Quickselect algorithm
    Goldstein, Larry
    ELECTRONIC JOURNAL OF PROBABILITY, 2018, 23
  • [9] Stochastic Optimization for DC Functions and Non-smooth Non-convex Regularizers with Non-asymptotic Convergence
    Xu, Yi
    Qi, Qi
    Lin, Qihang
    Jin, Rong
    Yang, Tianbao
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [10] Non-asymptotic analysis of ensemble Kalman updates: effective dimension and localization
    Al-Ghattas, Omar
    Sanz-Alonso, Daniel
    INFORMATION AND INFERENCE-A JOURNAL OF THE IMA, 2024, 13 (01)