Revisiting the ODE Method for Recursive Algorithms: Fast Convergence Using Quasi Stochastic Approximation

Cited by: 0

Authors:

Shuhang Chen

Adithya Devraj

Andrey Bernstein

Sean Meyn

Affiliations:

[1] University of Florida, Department of Mathematics

[2] Stanford University

[3] NREL

[4] University of Florida

Source:

Journal of Systems Science and Complexity | 2021 / Vol. 34

Keywords:

Learning and adaptive systems in artificial intelligence; reinforcement learning; stochastic approximation

DOI:

Not available

Chinese Library Classification number:

Subject classification number:

Abstract:

Several decades ago, Profs. Sean Meyn and Lei Guo were postdoctoral fellows at ANU, where they shared an interest in recursive algorithms. It seems fitting to celebrate Lei Guo’s 60th birthday with a review of the ODE Method and its recent evolution, with a focus on the following themes: The method has been regarded as a technique for algorithm analysis. It is argued that this viewpoint is backwards: the original stochastic approximation method was surely motivated by an ODE, and tools for analysis came much later (based on establishing robustness of Euler approximations). The paper presents a brief survey of recent research in machine learning that shows the power of algorithm design in continuous time, followed by careful approximation to obtain a practical recursive algorithm. While these methods are usually presented in a stochastic setting, this is not a prerequisite. In fact, recent theory shows that rates of convergence can be dramatically accelerated by applying techniques inspired by quasi Monte-Carlo. Subject to conditions, the optimal rate of convergence can be obtained by applying the averaging technique of Polyak and Ruppert. The conditions are not universal, but theory suggests alternatives to achieve acceleration. The theory is illustrated with applications to gradient-free optimization and policy gradient algorithms for reinforcement learning.


Pages: 1681-1702

Page count: 21
