Optimal threshold probability in undiscounted Markov decision processes with a target set

Cited by: 16
Author
Ohtsubo, Y [1]
Affiliation
[1] Kochi Univ, Fac Sci, Dept Math & Informat Sci, Kochi 7808520, Japan
Keywords
Markov decision process; minimizing risk model; existence of optimal policy; value iteration; policy improvement method
DOI
10.1016/S0096-3003(03)00158-9
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
We consider risk-minimizing problems in undiscounted Markov decision processes with a target set. We formulate the problem as an infinite-horizon case with a recurrent class. We show that the optimal value function is the unique solution to an optimality equation and that a stationary optimal policy exists. We also give several value iteration methods and a policy improvement method. (C) 2003 Elsevier Inc. All rights reserved.
Pages: 519-532
Page count: 14