共 61 条
[11]
Guo L(1977)-learning Trans. on Automatic Control 22 551-575
[12]
Ljung L(2020)Analysis of recursive stochastic algorithms IFAC — PapersOnLine 53 1373-1378
[13]
Robbins H(2000)Adaptive and robust control in the USSR Communications in Contemporary Mathematics 2 1-34
[14]
Monro S(2017)The heavy ball with friction method, I. the continuous dynamical system: Global exploration of the local minima of a real-valued function by asymptotic analysis of a dissipative dynamical system Proc. Advances in Neural Information Processing Systems 30 6796-6806
[15]
Jaakola T(2016)Acceleration and averaging in stochastic descent dynamics Proc. of the National Academy of Sciences 113 E7351-E7358
[16]
Jordan M(1990)A variational perspective on accelerated methods in optimization Statistics 21 251-272
[17]
Singh S(2012)Sequences with low discrepancy generalisation and application to Robbins-Monro algorithm Monte Carlo Methods and Applications 18 1-51
[18]
Tsitsiklis J(2000)Stochastic approximation with averaging innovation applied to finance SIAM J. Control Optim. 38 447-469
[19]
Ljung L(2019)The ODE method for convergence of stochastic approximation and reinforcement learning IEEE Transactions on Automatic Control 64 2614-2620
[20]
Fradkov A(1992)Stability of stochastic approximations with ‘controlled Markov’ noise and temporal difference learning SIAM J. Control Optim. 30 838-855