Constrained Continuous-Time Markov Decision Processes on the Finite Horizon

Cited by: 4

Authors:

Guo, Xianping [1]

Huang, Yonghui [1]

Zhang, Yi [2]

Affiliations:

[1] Sun Yat Sen Univ, Sch Math & Computat Sci, Guangzhou 510275, Guangdong, Peoples R China

[2] Univ Liverpool, Dept Math Sci, Liverpool L69 7ZL, Merseyside, England

Source:

APPLIED MATHEMATICS AND OPTIMIZATION | 2017, Vol. 75, No. 2

Keywords:

Continuous-time Markov decision process; Constrained-optimality; Finite horizon; Mixture of N+1 deterministic Markov policies; Occupation measure

DOI:

10.1007/s00245-016-9352-6

Chinese Library Classification:

O29 [Applied Mathematics]

Discipline classification code:

070104

Abstract:

This paper studies the constrained (nonhomogeneous) continuous-time Markov decision processes on the finite horizon. The performance criterion to be optimized is the expected total reward on the finite horizon, while N constraints are imposed on similar expected costs. Introducing the appropriate notion of the occupation measures for the concerned optimal control problem, we establish the following under some suitable conditions: (a) the class of Markov policies is sufficient; (b) every extreme point of the space of performance vectors is generated by a deterministic Markov policy; and (c) there exists an optimal Markov policy, which is a mixture of no more than N + 1 deterministic Markov policies.
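The problem described in the abstract can be written out informally as follows. This is only a sketch in assumed notation, not the paper's own: the horizon T, initial distribution γ, state and action processes ξ_t and a_t, reward rate r_0, cost rates c_n, and constraint bounds d_n are all illustrative symbols, and any terminal reward term is omitted.

```latex
% Constrained finite-horizon problem (notation assumed for illustration)
\[
\begin{aligned}
\text{maximize over policies } \pi \quad
  & V_0(\pi) := \mathbb{E}^{\pi}_{\gamma}\!\left[\int_0^T r_0(t,\xi_t,a_t)\,dt\right] \\
\text{subject to} \quad
  & V_n(\pi) := \mathbb{E}^{\pi}_{\gamma}\!\left[\int_0^T c_n(t,\xi_t,a_t)\,dt\right] \le d_n,
  \qquad n = 1,\dots,N .
\end{aligned}
\]

% Result (c): an optimal policy \pi^* whose performance vector is a convex
% combination of those of at most N+1 deterministic Markov policies \varphi_k
\[
V_n(\pi^*) = \sum_{k=1}^{N+1} p_k\, V_n(\varphi_k), \quad n = 0,1,\dots,N,
\qquad p_k \ge 0, \quad \sum_{k=1}^{N+1} p_k = 1 .
\]
```

Read this way, result (b) says that the extreme points of the set of attainable vectors (V_0(π), ..., V_N(π)) come from deterministic Markov policies, which is what makes a mixture of at most N + 1 such policies plausible in result (c).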

Pages: 317-341

Page count: 25
