Generalizing Markov decision processes to imprecise probabilities

Cited by: 20
Authors
Harmanec, D [1]
Affiliations
[1] Acad Sci Czech Republ, Inst Comp Sci, Prague 18207 8, Czech Republic
Keywords
generalized Markov decision process; sequential decision making; interval utilities
DOI
10.1016/S0378-3758(01)00210-5
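Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract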
This paper is a first step towards generalizing the concept of a Markov decision process to imprecise probabilities. A concept of a generalized Markov decision process is defined and motivated. Finite-horizon, fully observable models with the total cumulative reward optimality criterion are studied. The imprecision in the model opens up the possibility of indecision. A solution procedure that generalizes the backward induction method from the classical theory is developed. This procedure finds all maximal (i.e., undominated) policies for a given generalized Markov decision process. An example illustrating the solution method is given. Directions for further research are discussed. © 2002 Elsevier Science B.V. All rights reserved.
Pages: 199-213
Page count: 15