Constrained average cost Markov control processes in Borel spaces

被引:44
作者
Hernández-Lerma, O
González-Hernández, J
López-Martínez, RR
机构
[1] Inst Politecn Nacl, Ctr Invest & Estudios Avanzados, Dept Matemat, Mexico City 07000, DF, Mexico
[2] Univ Nacl Autonoma Mexico, Dept Probabil & Estadist, IIMAS, Mexico City 01000, DF, Mexico
[3] Univ Veracruzana, Fac Matemat, Xalapa 91090, Veracruz, Mexico
关键词
constrained Markov control processes; average cost; discounted cost;
D O I
10.1137/S0363012999361627
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
This paper considers constrained Markov control processes in Borel spaces, with unbounded costs. The criterion to be minimized is a long-run expected average cost, and the constraints can be imposed on similar average costs, or on average rewards, or discounted costs or rewards. We give conditions under which the constrained problem (CP) is solvable and equivalent to an equality constrained (EC) linear program. Furthermore, we show that there is no duality gap between EC and the dual program EC* and that in fact the strong duality condition holds. Finally, we introduce an explicit procedure to solve CP in some cases which is illustrated with a detailed example.
引用
收藏
页码:442 / 468
页数:27
相关论文
共 43 条
[1]  
Altman E., 1999, STOCH MODEL SER
[2]  
Anderson EJ, 1987, LINEAR PROGRAMMING I
[3]   DISCRETE-TIME CONTROLLED MARKOV-PROCESSES WITH AVERAGE COST CRITERION - A SURVEY [J].
ARAPOSTATHIS, A ;
BORKAR, VS ;
FERNANDEZGAUCHERAND, E ;
GHOSH, MK ;
MARCUS, SI .
SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 1993, 31 (02) :282-344
[4]  
Ash RB., 1972, REAL ANAL PROBABILIT
[5]  
BERTSEKAS D. P, 1978, Neuro-dynamic programming
[6]  
BILLINGSLEY P., 1999, Convergence of Probability Measures, V2nd, DOI 10.1002/9780470316962
[7]   MEMORYLESS STRATEGIES IN FINITE-STAGE DYNAMIC-PROGRAMMING [J].
BLACKWELL, D .
ANNALS OF MATHEMATICAL STATISTICS, 1964, 35 (02) :863-&
[8]   ERGODIC CONTROL OF MARKOV-CHAINS WITH CONSTRAINTS - THE GENERAL-CASE [J].
BORKAR, VS .
SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 1994, 32 (01) :176-186
[9]   GENERALIZATIONS OF FARKAS THEOREM [J].
CRAVEN, BD ;
KOLIHA, JJ .
SIAM JOURNAL ON MATHEMATICAL ANALYSIS, 1977, 8 (06) :983-997
[10]   AVERAGE, SENSITIVE AND BLACKWELL OPTIMAL POLICIES IN DENUMERABLE MARKOV DECISION CHAINS WITH UNBOUNDED REWARDS [J].
DEKKER, R ;
HORDIJK, A .
MATHEMATICS OF OPERATIONS RESEARCH, 1988, 13 (03) :395-420