CHARACTERIZATION OF OPTIMAL POLICIES IN VECTOR-VALUED MARKOVIAN DECISION-PROCESSES

Cited by: 25

Authors:

FURUKAWA, N

Source:

MATHEMATICS OF OPERATIONS RESEARCH | 1980 / Vol. 5 / No. 2

Keywords:

922 Statistical Methods;

DOI:

10.1287/moor.5.2.271

Chinese Library Classification (CLC) codes:

C93 [Management Science]; O22 [Operations Research];

Subject classification codes:

070105 ; 12 ; 1201 ; 1202 ; 120202 ;

Abstract:

Infinite horizon Markovian decision processes with R^p-valued additive utilities are considered. The optimization criterion, here, is a pseudo-order preference relation induced by a convex cone in R^p. The state space is a countable set, and the action space is a compact metric space. Certain assumptions on the continuity of the reward vector and the transition probability are made. In this setting, an algorithm improving policies with respect to the chosen preference relation is given. A point-to-set mapping is defined, and optimal policies are characterized by fixed points of the mapping which are maximal in the set of all fixed points.

Pages: 271-279

Page count: 9