ON THE CONVERGENCE AND ODE LIMIT OF A 2-DIMENSIONAL STOCHASTIC-APPROXIMATION

Cited by: 0
Authors
MA, DJ
MAKOWSKI, AM
Affiliations
[1] UNIV MARYLAND,DEPT ELECT ENGN,COLLEGE PK,MD 20742
[2] UNIV MARYLAND,INST SYST RES,COLLEGE PK,MD 20742
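Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract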
We consider a two-dimensional stochastic approximation scheme of the Robbins-Monro type which arises naturally in the study of steering policies for Markov decision processes [6], [7]. Using a decoupling change of variables, we establish its almost sure convergence by ad hoc arguments that combine standard results on one-dimensional stochastic approximations with a version of the law of large numbers for martingale differences. We use this direct analysis to guide the selection of the test function that appears in standard convergence results for multidimensional schemes. Furthermore, although a blind application of the ODE method is not possible here because the required regularity properties are lacking, the same change of variables paves the way for an interpretation of the behavior of solutions to the associated limiting ODE.
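For readers unfamiliar with the term, the following is a minimal sketch of a generic one-dimensional Robbins-Monro iteration, the building block the abstract refers to. It is not the two-dimensional scheme studied in the paper, whose exact form is not given in this record. The function names, the target root 2.0, the Gaussian noise model, and the step sizes 1/n are all illustrative assumptions.

```python
import random


def robbins_monro(noisy_g, x0, n_steps):
    """Generic 1-D Robbins-Monro iteration (illustrative, not the paper's scheme).

    Iterates x_{n+1} = x_n - a_n * Y_n, where Y_n is a noisy observation of
    g(x_n) and a_n = 1/n, so that sum(a_n) diverges and sum(a_n^2) converges.
    """
    x = x0
    for n in range(1, n_steps + 1):
        step = 1.0 / n            # assumed step-size sequence a_n = 1/n
        x -= step * noisy_g(x)    # move against the noisy observation
    return x


# Hypothetical example: g(x) = x - 2 observed with additive Gaussian noise.
rng = random.Random(0)            # fixed seed so the run is reproducible


def noisy_g(x):
    return (x - 2.0) + rng.gauss(0.0, 1.0)


estimate = robbins_monro(noisy_g, x0=0.0, n_steps=20000)
print(estimate)                   # should land close to the root 2.0
```

With this particular choice of g and step sizes, the iterate reduces to a running average of the noisy observations, so its convergence is an instance of the law of large numbers, which gives some intuition for why that tool appears in the abstract.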
Pages: 1439-1442
Page count: 4
Related papers
9 items in total
  • [1] MULTIDIMENSIONAL STOCHASTIC APPROXIMATION METHODS
    BLUM, JR
    [J]. ANNALS OF MATHEMATICAL STATISTICS, 1954, 25 (04): : 737 - 744
  • [2] GLADYSHEV EG, 1985, THEO PROB APPL, V10, P275
  • [3] HALL P, 1980, MARTINGALE LIMIT THE
  • [4] Loeve M., 1977, PROBABILITY THEORY, V1
  • [5] MA DJ, 1988, 27TH P IEEE C DEC CO, P1192
  • [6] MA DJ, 1992, 31ST P IEEE C DEC CO, P3344
  • [7] MA DJ, 1988, THESIS U MARYLAND CO
  • [8] APPLICATIONS OF A KUSHNER AND CLARK LEMMA TO GENERAL CLASSES OF STOCHASTIC ALGORITHMS
    METIVIER, M
    PRIOURET, P
    [J]. IEEE TRANSACTIONS ON INFORMATION THEORY, 1984, 30 (02) : 140 - 151
  • [9] NEVELSON M, 1976, AMS T MATH MONOGRAPH, V47