Generalizing Markov decision processes to imprecise probabilities

Cited by: 20

Authors:

Harmanec, D [1]

Affiliations:

[1] Acad Sci Czech Republ, Inst Comp Sci, Prague 18207 8, Czech Republic

Source:

JOURNAL OF STATISTICAL PLANNING AND INFERENCE | 2002 / Vol. 105 / No. 1

Keywords:

generalized Markov decision process; sequential decision making; interval utilities;

DOI:

10.1016/S0378-3758(01)00210-5

Chinese Library Classification (CLC) number:

O21 [Probability theory and mathematical statistics]; C8 [Statistics];

Discipline classification code:

020208 ; 070103 ; 0714 ;

Abstract:

This paper is a first step towards generalizing the concept of a Markov decision process to imprecise probabilities. A concept of a generalized Markov decision process is defined and motivated. Finite horizon, fully observable models with the total cumulative reward optimality criterion are studied. The imprecision in the model opens up a possibility of indecision. A solution procedure that generalizes the backward induction method from the classical theory is developed. This procedure finds all maximal (i.e., undominated) policies for a given generalized Markov decision process. An example illustrating the solution method is given. Directions for further research are discussed. © 2002 Elsevier Science B.V. All rights reserved.


Pages: 199-213

Number of pages: 15

