CHARACTERIZATION OF OPTIMAL POLICIES IN VECTOR-VALUED MARKOVIAN DECISION-PROCESSES

Times cited: 25
Author
FURUKAWA, N
Keywords
922 Statistical Methods;
DOI
10.1287/moor.5.2.271
Chinese Library Classification number
C93 [Management Science]; O22 [Operations Research];
Discipline classification code
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
Abstract
Infinite-horizon Markovian decision processes with R^p-valued additive utilities are considered. The optimization criterion is a pseudo-order preference relation induced by a convex cone in R^p. The state space is a countable set, and the action space is a compact metric space. Certain continuity assumptions on the reward vector and the transition probability are made. In this setting, an algorithm that improves policies with respect to the chosen preference relation is given. A point-to-set mapping is defined, and optimal policies are characterized by fixed points of the mapping that are maximal in the set of all fixed points.
Pages: 271-279
Number of pages: 9
Related papers
5 entries in total
  • [1] Blackwell D., 1965, ANN MATH STAT, V36, P226
  • [2] BROWN TA, 1965, MATH ANAL APPL, V12, P364
  • [3] PREFERENCE ORDER DYNAMIC-PROGRAMMING
    MITTEN, LG
    [J]. MANAGEMENT SCIENCE SERIES A-THEORY, 1974, 21 (01): 43-46
  • [4] ORDINAL DYNAMIC-PROGRAMMING
    SOBEL, MJ
    [J]. MANAGEMENT SCIENCE SERIES A-THEORY, 1975, 21 (09): 967-975
  • [5] [No title captured]