Convergence rate of linear two-time-scale stochastic approximation

被引：80

作者：

Konda, VR ^{[1
]}

Tsitsiklis, JN ^{[1
]}

机构：

[1] MIT, Informat & Decis Syst Lab, Cambridge, MA 02139 USA

来源：

ANNALS OF APPLIED PROBABILITY | 2004年 / 14卷 / 02期

关键词：

stochastic approximation; two-time-scales;

D O I：

10.1214/105051604000000116

中图分类号：

O21 [概率论与数理统计]; C8 [统计学];

学科分类号：

020208 ; 070103 ; 0714 ;

摘要：

We study the rate of convergence of linear two-time-scale stochastic approximation methods. We consider two-time-scale linear iterations driven by i.i.d. noise, prove some results on their asymptotic covariance and establish asymptotic normality. The well-known result [Polyak, B. T. (1990). Automat. Remote Contr 51 937-946; Ruppert, D. (1988). Technical Report 781, Cornell Univ.] on the optimality of Polyak-Ruppert averaging techniques specialized to linear stochastic approximation is established as a consequence of the general results in this paper.

引用

页码：796 / 819

页数：24

共 17 条

[1]

BARAS JS, 2000, P 39 IEEE C DEC CONT

[2]

Benveniste A, 1990, Adaptive algorithms and stochastic approximations

[3]

BHATNAGAR, 2001, IIE T, V3, P245

[4] Optimal structured feedback policies for ABR flow control using two-timescale SPSA [J].

Bhatnagar, S ;

Fu, MC ;

Marcus, SI ;

Fard, PJ .

IEEE-ACM TRANSACTIONS ON NETWORKING, 2001, 9 (04) :479-491

[5] Stochastic approximation with two time scales [J].

Borkar, VS .

SYSTEMS & CONTROL LETTERS, 1997, 29 (05) :291-294

[6]

Duflo M., 1997, RANDOM ITERATIVE MOD

[7] APPLICATIONS OF SINGULAR PERTURBATION TECHNIQUES TO CONTROL-PROBLEMS [J].

KOKOTOVIC, PV .

SIAM REVIEW, 1984, 26 (04) :501-550

[8]

Konda V., 2002, THESIS MIT

[9] Actor-critic-type learning algorithms for Markov decision processes [J].

Konda, VR ;

Borkar, VS .

SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 1999, 38 (01) :94-123

[10] STOCHASTIC-APPROXIMATION WITH AVERAGING OF THE ITERATES - OPTIMAL ASYMPTOTIC RATE OF CONVERGENCE FOR GENERAL PROCESSES [J].

KUSHNER, HJ ;

YANG, JC .

SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 1993, 31 (04) :1045-1062

← 1 2 →