Strong n-discount and finite-horizon optimality for continuous-time Markov decision processes

Cited by: 0
Authors
Quanxin Zhu
Xianping Guo
Affiliations
[1] Nanjing Normal University,School of Mathematical Sciences and Institute of Finance and Statistics
[2] Zhongshan University,School of Mathematics and Computational Science
Source
Journal of Systems Science and Complexity | 2014, Vol. 27
Keywords
Continuous-time Markov decision process; expected average reward criterion; finite-horizon optimality; Polish space; strong n-discount optimality
DOI
Not available
Abstract
This paper studies the strong n (n = −1, 0)-discount and finite-horizon criteria for continuous-time Markov decision processes in Polish spaces. The transition rates are allowed to be unbounded, and the reward rates may have neither upper nor lower bounds. Under mild conditions, the authors prove the existence of strong n (n = −1, 0)-discount optimal stationary policies by developing two equivalence relations: one between the standard expected average reward criterion and strong −1-discount optimality, and the other between the bias and strong 0-discount optimality. The authors also prove the existence of an optimal policy for a finite-horizon control problem by establishing an interesting characterization of a canonical triplet.
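For orientation, the strong n-discount criterion referred to in the abstract follows the standard Veinott-type definition from the MDP literature; the notation below (value function V_α, reward rate r, state process ξ_t) is a sketch under assumed conventions, not taken from the paper itself:

```latex
% Discounted value of a policy \pi from initial state x, discount rate \alpha > 0:
V_\alpha(\pi, x) \;=\; \mathbb{E}_x^{\pi}\!\int_0^{\infty} e^{-\alpha t}\, r(\xi_t, a_t)\, dt .

% A stationary policy \pi^* is strong n-discount optimal (here n = -1, 0) if
\liminf_{\alpha \downarrow 0}\; \alpha^{-n}\,\bigl[\, V_\alpha(\pi^*, x) - V_\alpha(\pi, x) \,\bigr] \;\ge\; 0
\quad \text{for every policy } \pi \text{ and every state } x .
```

In this hierarchy, n = −1 corresponds to the expected average reward criterion and n = 0 to a finer bias-type refinement, which matches the two equivalence relations stated in the abstract.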
Pages: 1045–1063
Page count: 18