MARKOV DECISION-MODELS WITH WEIGHTED DISCOUNTED CRITERIA

Cited by: 37
Authors
FEINBERG, EA [1 ]
SHWARTZ, A [1 ]
Affiliation
[1] TECHNION ISRAEL INST TECHNOL,DEPT ELECT ENGN,IL-32000 HAIFA,ISRAEL
Keywords
DYNAMIC PROGRAMMING; MARKOV; SUM OF DISCOUNTED REWARDS WITH DIFFERENT DISCOUNT FACTORS;
DOI
10.1287/moor.19.1.152
Chinese Library Classification number
C93 [Management Science]; O22 [Operations Research];
Subject classification code
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
Abstract
We consider a discrete-time Markov decision process with an infinite horizon. The criterion to be maximized is the sum of a number of standard discounted rewards, each with a different discount factor. Such criteria arise in modeling investments and production, in modeling projects of different durations and systems with multiple criteria, and in some axiomatic formulations of multi-attribute preference theory. We show that, for this criterion, there need not exist an epsilon-optimal (randomized) stationary strategy for some positive epsilon, even when the state and action sets are finite. However, epsilon-optimal Markov (nonrandomized) strategies and optimal Markov strategies exist under weak conditions. We exhibit epsilon-optimal Markov strategies which are stationary from some time onward. When both state and action spaces are finite, there exists an optimal Markov strategy with this property. We provide an explicit algorithm for the computation of such strategies and give a description of the set of optimal strategies.
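The criterion described in the abstract can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: the function name `weighted_discounted_value`, the data layout (`P[a][x][y]`, `rewards[k][x][a]`), the finite truncation `horizon`, and the one-state, two-action toy model are all assumptions made here for illustration. The sketch evaluates the sum of several discounted rewards, each with its own discount factor, under a Markov (time-dependent, nonrandomized) strategy.

```python
def weighted_discounted_value(P, rewards, betas, policy, start, horizon=500):
    """Approximate sum_k E[ sum_t betas[k]**t * rewards[k][x_t][a_t] ].

    P[a][x][y]       : transition probability from x to y under action a
    rewards[k][x][a] : one-step reward of the k-th criterion
    betas[k]         : discount factor of the k-th criterion, in [0, 1)
    policy(t, x)     : action chosen at time t in state x (Markov strategy)
    horizon          : truncation point (an assumption; the tail is negligible
                       when every beta**horizon is tiny)
    """
    n = len(P[0])
    dist = [0.0] * n          # state distribution at the current time step
    dist[start] = 1.0
    total = 0.0
    for t in range(horizon):
        nxt = [0.0] * n
        for x in range(n):
            if dist[x] == 0.0:
                continue
            a = policy(t, x)
            # Each criterion k is discounted with its own factor betas[k].
            total += dist[x] * sum(b ** t * r[x][a] for b, r in zip(betas, rewards))
            for y in range(n):
                nxt[y] += dist[x] * P[a][x][y]
        dist = nxt
    return total


# Hypothetical toy model: one state, two actions, two criteria.
P = [[[1.0]], [[1.0]]]                     # both actions stay in state 0
rewards = [[[1.0, 0.0]], [[0.0, 0.5]]]     # criterion 0 pays for action 0, criterion 1 for action 1
betas = [0.5, 0.9]

always_0 = weighted_discounted_value(P, rewards, betas, lambda t, x: 0, 0)
always_1 = weighted_discounted_value(P, rewards, betas, lambda t, x: 1, 0)
switching = weighted_discounted_value(P, rewards, betas, lambda t, x: 0 if t < 2 else 1, 0)
```

In this toy model the two stationary nonrandomized strategies yield about 2 and 5, and a randomized stationary strategy yields a convex combination of the two, while the strategy that plays action 0 for two steps and action 1 afterwards yields about 5.55. This matches the abstract's qualitative picture: a Markov strategy that is stationary from some time onward can outperform every stationary strategy.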
Pages: 152-168
Number of pages: 17