Optimal threshold probability and expectation in semi-Markov decision processes

Cited by: 12
Authors
Sakaguchi, Masahiko [1 ]
Ohtsubo, Yoshio [1 ]
Affiliations
[1] Kochi Univ, Fac Sci, Dept Math, Kochi 7808520, Japan
Keywords
Semi-Markov decision process; Optimal threshold probability; Existence of optimal policy; Value iteration; Policy improvement method; Stochastic order; Minimizing risk models
DOI
10.1016/j.amc.2010.04.007
Chinese Library Classification
O29 [Applied Mathematics]
Subject Classification
070104
Abstract
We consider an undiscounted semi-Markov decision process with a target set, and our main concern is the problem of minimizing a threshold probability. We formulate the problem as an infinite-horizon model with a recurrent class. We show that the optimal value function is the unique solution of an optimality equation and that a stationary optimal policy exists. Several value iteration methods and a policy improvement method are also given for our model. Furthermore, we investigate the relationship between threshold probabilities and expectations of total rewards. (C) 2010 Elsevier Inc. All rights reserved.
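The abstract does not spell out the model's primitives, so the following Python sketch only illustrates the general flavor of a value-iteration method for a threshold-probability criterion; it is not the paper's construction. The assumptions here are mine: a finite state space, positive-integer sojourn times, and the objective of minimizing the probability that the total time needed to reach the target set exceeds a fixed threshold. The function name `threshold_value_iteration`, the kernel interface, and the toy example at the bottom are hypothetical.

```python
import numpy as np

# Illustrative sketch only (not the paper's exact model): a finite semi-Markov
# decision process with positive-integer sojourn times and a target set.
# Objective: minimize the threshold probability that the total time needed to
# reach the target exceeds a given threshold `theta`. The recursion runs on the
# augmented state (state, remaining time budget).

def threshold_value_iteration(states, actions, target, kernel, theta,
                              tol=1e-10, max_iter=10_000):
    """actions(x)  -> iterable of actions available in state x
    kernel(x, a)   -> dict mapping (next_state, sojourn_time) to probability
    Returns v with v[x][b] ~ minimal probability that the time to reach
    `target` from x exceeds the remaining budget b, for 0 <= b <= theta.
    """
    # Target states have already arrived (probability 0); start others at 1.
    v = {x: (np.zeros(theta + 1) if x in target else np.ones(theta + 1))
         for x in states}

    for _ in range(max_iter):
        delta = 0.0
        for x in states:
            if x in target:
                continue  # nothing to optimize once the target set is reached
            for b in range(theta + 1):
                best = 1.0
                for a in actions(x):
                    prob_exceed = 0.0
                    for (y, s), p in kernel(x, a).items():
                        if s > b:
                            prob_exceed += p              # budget already exhausted
                        else:
                            prob_exceed += p * v[y][b - s]
                    best = min(best, prob_exceed)
                delta = max(delta, abs(best - v[x][b]))
                v[x][b] = best
        if delta < tol:
            break
    return v


# Tiny hypothetical example: states A, B and target {"G"}, threshold 5.
if __name__ == "__main__":
    S = ["A", "B", "G"]
    acts = lambda x: ["fast", "slow"]

    def K(x, a):
        if a == "fast":   # quicker sojourns, but from A it may bounce back to A
            return {("G", 1): 0.5, ("A", 1): 0.5} if x == "A" else {("G", 1): 1.0}
        return {("B", 2): 1.0} if x == "A" else {("G", 3): 1.0}

    v = threshold_value_iteration(S, acts, {"G"}, K, theta=5)
    print(v["A"])  # minimal P(time to reach G > budget) for budgets 0..5
```

Because sojourn times are assumed to be at least 1, each update for budget b only looks at strictly smaller budgets, so the sweep converges after at most theta + 1 passes; the tolerance check simply stops it early.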
Pages: 2947-2958
Number of pages: 12