Value iteration in average cost Markov control processes on Borel spaces

Cited by: 8
Authors
Montes-de-Oca, R.
Hernández-Lerma, O.
Affiliations
[1] UNIV AUTONOMA METROPOLITANA IZTAPALAPA,DEPT MATEMAT,MEXICO CITY 09340,DF,MEXICO
[2] INST POLITECN NACL,CINVESTAV,DEPT MATEMAT,MEXICO CITY 07000,DF,MEXICO
Keywords
Markov control (or decision) processes; average cost; value iteration (or successive approximations)
DOI
10.1007/BF00047169
Chinese Library Classification (CLC) number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
This paper deals with discrete-time Markov control processes with Borel state and control spaces, with possibly unbounded costs and noncompact control constraint sets, and the average cost criterion. Conditions are given for the convergence of the value iteration algorithm to the optimal average cost, and for a sequence of finite-horizon optimal policies to have an accumulation point which is average cost optimal.
Pages: 203-222
Page count: 20
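
The value iteration scheme named in the abstract can be illustrated on a toy problem. The sketch below is not the paper's setting: it assumes a finite state and action space with bounded costs, whereas the paper treats Borel spaces, possibly unbounded costs and noncompact constraint sets. The function name `value_iteration`, the arrays `cost` and `trans`, and all numbers are hypothetical. It iterates v_{n+1}(s) = min_a [c(s,a) + sum_t P(t|s,a) v_n(t)] from v_0 = 0 and reports the smallest and largest one-step increments v_{n+1}(s) - v_n(s), which bracket the optimal average cost in this toy model, together with the minimizing decision rule at the last stage.

```python
def value_iteration(cost, trans, n_iter=200):
    """Finite-state value iteration for the average cost criterion.

    cost[s][a]     : one-stage cost of action a in state s (hypothetical data)
    trans[s][a][t] : probability of moving from s to t under action a
    Returns (min increment, max increment, last-stage minimizing actions).
    """
    n_states = len(cost)
    v = [0.0] * n_states  # v_0 = 0
    for _ in range(n_iter):
        # q[s][a] = c(s,a) + sum_t P(t|s,a) * v_n(t)
        q = [[cost[s][a] + sum(p * v[t] for t, p in enumerate(trans[s][a]))
              for a in range(len(cost[s]))]
             for s in range(n_states)]
        v_next = [min(row) for row in q]
        policy = [row.index(min(row)) for row in q]
        increments = [v_next[s] - v[s] for s in range(n_states)]
        v = v_next
    return min(increments), max(increments), policy


# Toy two-state, two-action model (all numbers are made up for illustration).
cost = [[2.0, 0.5],
        [1.0, 3.0]]
trans = [[[0.9, 0.1], [0.2, 0.8]],
         [[0.3, 0.7], [0.8, 0.2]]]

lo, hi, policy = value_iteration(cost, trans)
print(lo, hi, policy)
```

In this toy model every transition probability is positive, so the increments settle quickly; the general conditions under which such convergence holds are the subject of the paper and are not reproduced here.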