Markov Decision Processes on Borel Spaces with Total Cost and Random Horizon

Cited by: 6
Authors
Cruz-Suarez, Hugo [1]
Ilhuicatzi-Roldan, Rocio [1]
Montes-de-Oca, Raul [2]
Affiliations
[1] Benemerita Univ Autonoma Puebla, Fac Ciencias Fis Matemat, Puebla, Mexico
[2] Univ Autonoma Metropolitana Iztapalapa, Dept Matemat, Mexico City 09340, DF, Mexico
Keywords
Markov decision process; Total cost; Random horizon; Varying-time discount factor
DOI
10.1007/s10957-012-0262-8
Chinese Library Classification (CLC) number
C93 [Management Science]; O22 [Operations Research]
Discipline classification code
070105; 12; 1201; 1202; 120202
Abstract
This paper deals with Markov Decision Processes (MDPs) on Borel spaces with possibly unbounded costs. The criterion to be optimized is the expected total cost with a random horizon of infinite support. It is observed that this performance criterion is equivalent to the expected total discounted cost with an infinite horizon and a varying-time discount factor. The optimal value function and the optimal policy are then characterized through suitable versions of the Dynamic Programming Equation. Moreover, it is proved that the optimal value function of the optimal control problem with a random horizon is bounded from above by the optimal value function of a discounted optimal control problem with a fixed discount factor. This fixed discount factor is defined in terms of the parameters introduced for the study of the optimal control problem with a random horizon. To illustrate the theory, a version of the Linear-Quadratic model with a random horizon and a Logarithm Consumption-Investment model are presented.
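The equivalence noted in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the paper's own derivation: the notation ($\tau$, $P_t$, $\alpha_k$, $c$, $x_t$, $a_t$, $\pi$) is assumed here and may differ from the paper's, the horizon $\tau$ is assumed independent of the state-action process, the cost at stage $\tau$ is assumed to be included in the sum, and sums and expectations are assumed interchangeable (for example, for nonnegative costs).

```latex
% Assumed notation: \tau is the random horizon, P_t := P(\tau \ge t), so P_0 = 1.
% Infinite support of \tau gives P_t > 0 for every t, so the ratios \alpha_k exist.
\begin{align*}
E_x^{\pi}\Big[\sum_{t=0}^{\tau} c(x_t,a_t)\Big]
  &= E_x^{\pi}\Big[\sum_{t=0}^{\infty} \mathbf{1}\{\tau \ge t\}\, c(x_t,a_t)\Big] \\
  &= \sum_{t=0}^{\infty} P_t \, E_x^{\pi}\big[c(x_t,a_t)\big]
     && \text{(independence of } \tau\text{)} \\
  &= E_x^{\pi}\Big[\sum_{t=0}^{\infty} \Big(\prod_{k=0}^{t-1} \alpha_k\Big) c(x_t,a_t)\Big],
     && \alpha_k := \frac{P_{k+1}}{P_k} \in (0,1].
\end{align*}
% The empty product (t = 0) equals 1, and \prod_{k=0}^{t-1} \alpha_k = P_t / P_0 = P_t.
```

Under these assumptions, the product $\prod_{k=0}^{t-1}\alpha_k$ plays the role of a discount factor that changes with time. If $\tau$ were geometric, all $\alpha_k$ would coincide and the criterion would reduce to the standard discounted cost with a fixed discount factor.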
Pages: 329-346
Number of pages: 18