Constrained Continuous-Time Markov Decision Processes on the Finite Horizon

Cited by: 4
Authors
Guo, Xianping [1]
Huang, Yonghui [1]
Zhang, Yi [2]
Affiliations
[1] Sun Yat Sen Univ, Sch Math & Computat Sci, Guangzhou 510275, Guangdong, Peoples R China
[2] Univ Liverpool, Dept Math Sci, Liverpool L69 7ZL, Merseyside, England
Source
APPLIED MATHEMATICS AND OPTIMIZATION | 2017 / Vol. 75 / No. 2
Keywords
Continuous-time Markov decision process; Constrained-optimality; Finite horizon; Mixture of N+1 deterministic Markov policies; Occupation measure
DOI
10.1007/s00245-016-9352-6
Chinese Library Classification number
O29 [Applied mathematics]
Discipline classification code
070104
Abstract
This paper studies the constrained (nonhomogeneous) continuous-time Markov decision processes on the finite horizon. The performance criterion to be optimized is the expected total reward on the finite horizon, while N constraints are imposed on similar expected costs. Introducing the appropriate notion of the occupation measures for the concerned optimal control problem, we establish the following under some suitable conditions: (a) the class of Markov policies is sufficient; (b) every extreme point of the space of performance vectors is generated by a deterministic Markov policy; and (c) there exists an optimal Markov policy, which is a mixture of no more than N + 1 deterministic Markov policies.
Pages: 317-341
Page count: 25