Risk-sensitive and minimax control of discrete-time, finite-state Markov decision processes

Cited by: 63
Authors
Coraluppi, SP
Marcus, SI
Affiliations
[1] ALPHATECH Inc, Burlington, MA 01803 USA
[2] Univ Maryland, Dept Elect Engn, College Pk, MD 20742 USA
[3] Univ Maryland, Syst Res Inst, College Pk, MD 20742 USA
Funding
National Science Foundation (USA);
Keywords
stochastic control; risk-sensitive control; minimax control; Markov decision processes;
DOI
10.1016/S0005-1098(98)00153-8
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper analyzes a connection between risk-sensitive and minimax criteria for discrete-time, finite-state Markov decision processes (MDPs). We synthesize optimal policies with respect to both criteria, both for the finite horizon and the discounted infinite horizon problem. A generalized decision-making framework is introduced, which includes as special cases a number of approaches that have been considered in the literature. The framework allows for discounted risk-sensitive and minimax formulations leading to stationary optimal policies on the infinite horizon. We illustrate our results with a simple machine replacement problem. (C) 1999 Elsevier Science Ltd. All rights reserved.
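The link between the two criteria on a finite horizon can be illustrated with a minimal sketch. This is not the paper's algorithm or its machine replacement example: the two-state model, its costs and transition probabilities, and all function names below are hypothetical. The sketch assumes an undiscounted finite-horizon problem, a positive risk parameter gamma, and the standard exponential-utility recursion, in which the expectation over next states is replaced by the certainty equivalent (1/gamma) log E[exp(gamma V)]. For large gamma this quantity approaches the worst case over reachable next states, which is the minimax recursion.

```python
import math

# Hypothetical two-state machine replacement model; all numbers are illustrative.
# States: 0 = working, 1 = failed. Actions: 0 = keep, 1 = replace.
COST = [[0.0, 2.0], [3.0, 2.0]]      # COST[state][action]
P = [
    [[0.9, 0.1], [0.0, 1.0]],        # P[action][state][next_state] for "keep"
    [[1.0, 0.0], [1.0, 0.0]],        # same layout for "replace"
]


def expectation(probs, values):
    """Risk-neutral aggregate: expected cost-to-go."""
    return sum(p * v for p, v in zip(probs, values))


def worst_case(probs, values):
    """Minimax aggregate: largest cost-to-go over reachable next states."""
    return max(v for p, v in zip(probs, values) if p > 0.0)


def certainty_equivalent(probs, values, gamma):
    """Risk-sensitive aggregate (1/gamma) * log E[exp(gamma * V)], gamma > 0.

    The maximum is factored out so the exponentials never overflow.
    """
    support = [(p, v) for p, v in zip(probs, values) if p > 0.0]
    m = max(v for _, v in support)
    s = sum(p * math.exp(gamma * (v - m)) for p, v in support)
    return m + math.log(s) / gamma


def backward_induction(horizon, aggregate):
    """Finite-horizon dynamic programming with zero terminal cost."""
    values = [0.0, 0.0]
    for _ in range(horizon):
        values = [
            min(COST[x][u] + aggregate(P[u][x], values) for u in range(2))
            for x in range(2)
        ]
    return values


if __name__ == "__main__":
    print("risk-neutral:", backward_induction(5, expectation))
    for gamma in (0.1, 1.0, 10.0, 500.0):
        rs = backward_induction(5, lambda p, v: certainty_equivalent(p, v, gamma))
        print("gamma =", gamma, rs)
    print("minimax:", backward_induction(5, worst_case))
```

Under these assumptions the risk-sensitive values lie between the risk-neutral and minimax values and move toward the minimax values as gamma grows. The discounted infinite-horizon formulations mentioned in the abstract are not covered by this sketch.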
Pages: 301-309
Number of pages: 9
Related papers
26 in total
[1]  
BARAS J, 1997, J MATH SYSTEM ESTIMA, V7
[2]   TOTAL RISK-AVERSION, STOCHASTIC OPTIMAL-CONTROL, AND DIFFERENTIAL-GAMES [J].
BARRON, EN ;
JENSEN, R .
APPLIED MATHEMATICS AND OPTIMIZATION, 1989, 19 (03) :313-327
[3]  
Basar T, 1995, H OPTIMAL CONTROL RE
[4]   OPTIMAL-CONTROL OF PARTIALLY OBSERVABLE STOCHASTIC-SYSTEMS WITH AN EXPONENTIAL-OF-INTEGRAL PERFORMANCE INDEX [J].
BENSOUSSAN, A ;
VANSCHUPPEN, JH .
SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 1985, 23 (04) :599-613
[5]  
Bertsekas D. P., 1971, Proceedings of the 1971 IEEE Conference on Decision and Control (Including the 10th Symposium on Adaptive Processes), P451
[6]   DISCOUNTED MDP - DISTRIBUTION-FUNCTIONS AND EXPONENTIAL UTILITY MAXIMIZATION [J].
CHUNG, KJ ;
SOBEL, MJ .
SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 1987, 25 (01) :49-62
[7]  
CORALUPPI SP, 1997, THESIS U MARYLAND
[8]  
EAGLE J, 1975, THESIS STANFORD U
[9]   SUBSTITUTION, RISK-AVERSION, AND THE TEMPORAL BEHAVIOR OF CONSUMPTION AND ASSET RETURNS - A THEORETICAL FRAMEWORK [J].
EPSTEIN, LG ;
ZIN, SE .
ECONOMETRICA, 1989, 57 (04) :937-969
[10]   Risk-sensitive optimal control of hidden Markov models: Structural results [J].
FernandezGaucherand, E ;
Marcus, SI .
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1997, 42 (10) :1418-1422