Optimal threshold probability and expectation in semi-Markov decision processes

Cited by: 12
Authors
Sakaguchi, Masahiko [1 ]
Ohtsubo, Yoshio [1 ]
Affiliations
[1] Kochi Univ, Fac Sci, Dept Math, Kochi 7808520, Japan
Keywords
Semi-Markov decision process; Optimal threshold probability; Existence of optimal policy; Value iteration; Policy improvement method; Stochastic order; Minimizing risk models
DOI
10.1016/j.amc.2010.04.007
Chinese Library Classification
O29 [Applied Mathematics]
Subject Classification
070104
Abstract
We consider an undiscounted semi-Markov decision process with a target set, and our main concern is the problem of minimizing a threshold probability. We formulate the problem as an infinite-horizon model with a recurrent class. We show that the optimal value function is the unique solution of an optimality equation and that a stationary optimal policy exists. Several value iteration methods and a policy improvement method are also given for our model. Furthermore, we investigate the relationship between threshold probabilities and expectations of total rewards. (C) 2010 Elsevier Inc. All rights reserved.
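The abstract does not spell out the model's primitives, so the following Python sketch only illustrates the general flavor of a value-iteration method for a threshold-probability criterion; it is not the paper's construction. The assumptions here are mine: a finite state space, positive-integer sojourn times, and the objective of minimizing the probability that the total time needed to reach the target set exceeds a fixed threshold. The function name `threshold_value_iteration`, the kernel interface, and the toy example at the bottom are hypothetical.

```python
import numpy as np

# Illustrative sketch only (not the paper's exact model): a finite semi-Markov
# decision process with positive-integer sojourn times and a target set.
# Objective: minimize the threshold probability that the total time needed to
# reach the target exceeds a given threshold `theta`. The recursion runs on the
# augmented state (state, remaining time budget).

def threshold_value_iteration(states, actions, target, kernel, theta,
                              tol=1e-10, max_iter=10_000):
    """actions(x)  -> iterable of actions available in state x
    kernel(x, a)   -> dict mapping (next_state, sojourn_time) to probability
    Returns v with v[x][b] ~ minimal probability that the time to reach
    `target` from x exceeds the remaining budget b, for 0 <= b <= theta.
    """
    # Target states have already arrived (probability 0); start others at 1.
    v = {x: (np.zeros(theta + 1) if x in target else np.ones(theta + 1))
         for x in states}

    for _ in range(max_iter):
        delta = 0.0
        for x in states:
            if x in target:
                continue  # nothing to optimize once the target set is reached
            for b in range(theta + 1):
                best = 1.0
                for a in actions(x):
                    prob_exceed = 0.0
                    for (y, s), p in kernel(x, a).items():
                        if s > b:
                            prob_exceed += p              # budget already exhausted
                        else:
                            prob_exceed += p * v[y][b - s]
                    best = min(best, prob_exceed)
                delta = max(delta, abs(best - v[x][b]))
                v[x][b] = best
        if delta < tol:
            break
    return v


# Tiny hypothetical example: states A, B and target {"G"}, threshold 5.
if __name__ == "__main__":
    S = ["A", "B", "G"]
    acts = lambda x: ["fast", "slow"]

    def K(x, a):
        if a == "fast":   # quicker sojourns, but from A it may bounce back to A
            return {("G", 1): 0.5, ("A", 1): 0.5} if x == "A" else {("G", 1): 1.0}
        return {("B", 2): 1.0} if x == "A" else {("G", 3): 1.0}

    v = threshold_value_iteration(S, acts, {"G"}, K, theta=5)
    print(v["A"])  # minimal P(time to reach G > budget) for budgets 0..5
```

Because sojourn times are assumed to be at least 1, each update for budget b only looks at strictly smaller budgets, so the sweep converges after at most theta + 1 passes; the tolerance check simply stops it early.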
Pages: 2947-2958
Number of pages: 12