Risk-sensitive and minimax control of discrete-time, finite-state Markov decision processes

Cited by: 63

Authors:

Coraluppi, SP

Marcus, SI

Affiliations:

[1] ALPHATECH Inc, Burlington, MA 01803 USA

[2] Univ Maryland, Dept Elect Engn, College Pk, MD 20742 USA

[3] Univ Maryland, Syst Res Inst, College Pk, MD 20742 USA

Source:

AUTOMATICA | 1999 / Vol. 35 / No. 2

Funding:

US National Science Foundation;

Keywords:

stochastic control; risk-sensitive control; minimax control; Markov decision processes;

DOI:

10.1016/S0005-1098(98)00153-8

Chinese Library Classification:

TP [Automation technology, computer technology];

Subject classification code:

0812;

Abstract:

This paper analyzes a connection between risk-sensitive and minimax criteria for discrete-time, finite-state Markov decision processes (MDPs). We synthesize optimal policies with respect to both criteria, both for the finite horizon and the discounted infinite horizon problem. A generalized decision-making framework is introduced, which includes as special cases a number of approaches that have been considered in the literature. The framework allows for discounted risk-sensitive and minimax formulations leading to stationary optimal policies on the infinite horizon. We illustrate our results with a simple machine replacement problem. (C) 1999 Elsevier Science Ltd. All rights reserved.
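Illustrative note (not from the paper): the abstract does not state the recursions, so the following is only a minimal sketch under textbook assumptions. It uses the standard exponential-of-sum cost for the risk-sensitive criterion and a worst-case-over-reachable-successors recursion for the minimax criterion, both on the finite horizon. The two-state machine replacement model, its transition probabilities and costs, and the names `risk_sensitive_dp` and `minimax_dp` are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical two-state machine replacement model (all numbers are illustrative).
# States: 0 = working, 1 = failed. Actions: 0 = keep, 1 = replace.
P = {  # P[u][x][y] = probability of moving from state x to state y under action u
    0: [[0.9, 0.1], [0.0, 1.0]],
    1: [[0.9, 0.1], [0.9, 0.1]],
}
COST = {0: [0.0, 3.0], 1: [1.0, 1.0]}  # COST[u][x] = one-stage cost
STATES = (0, 1)
ACTIONS = (0, 1)


def risk_sensitive_dp(gamma, horizon):
    """Backward recursion for min (1/gamma) * log E[exp(gamma * total cost)]."""
    v = [0.0 for _ in STATES]  # zero terminal cost (an assumption)
    for _ in range(horizon):
        new_v = []
        for x in STATES:
            best = math.inf
            for u in ACTIONS:
                reachable = [y for y in STATES if P[u][x][y] > 0]
                # Shifted log-sum-exp keeps exp() from overflowing for large gamma.
                m = max(v[y] for y in reachable)
                s = sum(P[u][x][y] * math.exp(gamma * (v[y] - m)) for y in reachable)
                best = min(best, COST[u][x] + m + math.log(s) / gamma)
            new_v.append(best)
        v = new_v
    return v


def minimax_dp(horizon):
    """Backward recursion minimizing the worst-case total cost over reachable successors."""
    w = [0.0 for _ in STATES]
    for _ in range(horizon):
        w = [
            min(
                COST[u][x] + max(w[y] for y in STATES if P[u][x][y] > 0)
                for u in ACTIONS
            )
            for x in STATES
        ]
    return w
```

In this sketch the risk-sensitive values never exceed the minimax values and approach them as `gamma` grows, which is one standard way to see a link between the two criteria; whether this matches the paper's exact formulation is an assumption.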


Pages: 301-309

Page count: 9

