A characterization of the optimal risk-sensitive average cost in finite controlled Markov chains

Cited by: 29
Authors
Cavazos-Cadena, R [1]
Hernández-Hernández, D [2]
Affiliations
[1] Univ Autonoma Agraria Antonio Narro, Dept Estadist & Calculo, Saltillo 25315, Coahuila, Mexico
[2] Ctr Invest Matemat, Guanajuato 36000, GTO, Mexico
Keywords
decreasing function along trajectories; stopping time; nearly optimal policies; Hölder's inequality; simultaneous Doeblin condition; recurrent state
DOI
10.1214/105051604000000585
Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
This work concerns controlled Markov chains with finite state and action spaces. The transition law satisfies the simultaneous Doeblin condition, and the performance of a control policy is measured by the (long-run) risk-sensitive average cost criterion associated with a positive, but otherwise arbitrary, risk-sensitivity coefficient. Within this context, the optimal risk-sensitive average cost is characterized via a minimization problem in a finite-dimensional Euclidean space.
Pages: 175-212
Number of pages: 38