Markov Decision Processes with Average-Value-at-Risk criteria

Cited by: 88
Authors
Baeuerle, Nicole [1]
Ott, Jonathan [1]
Affiliations
[1] Karlsruhe Inst Technol, Inst Stochast, D-76128 Karlsruhe, Germany
Keywords
Markov Decision Problem; Average-Value-at-Risk; Time-consistency; Risk aversion; TIME; OPTIMIZATION; VARIANCE;
DOI
10.1007/s00186-011-0367-0
Chinese Library Classification number
C93 [Management Science]; O22 [Operations Research];
Discipline classification code
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
Abstract
We investigate the problem of minimizing the Average-Value-at-Risk (AVaR(tau)) of the discounted cost over a finite and an infinite horizon which is generated by a Markov Decision Process (MDP). We show that this problem can be reduced to an ordinary MDP with extended state space and give conditions under which an optimal policy exists. We also give a time-consistent interpretation of the AVaR(tau). At the end we consider a numerical example which is a simple repeated casino game. It is used to discuss the influence of the risk aversion parameter tau of the AVaR(tau)-criterion.
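As a minimal sketch of the criterion named in the abstract, the following Python snippet estimates the AVaR at level tau of a finite sample of costs. It assumes the standard representation AVaR_tau(X) = min_s { s + E[(X - s)^+] / (1 - tau) } for tau in [0, 1); the function name `empirical_avar` and the sample data are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def empirical_avar(costs: Sequence[float], tau: float) -> float:
    """Estimate AVaR at level tau of an equally weighted cost sample.

    Assumes the representation
        AVaR_tau(X) = min_s { s + E[(X - s)^+] / (1 - tau) },  0 <= tau < 1.
    For an empirical distribution the objective is convex and piecewise
    linear with kinks at the sample points, so it suffices to search there.
    """
    if not costs:
        raise ValueError("costs must be non-empty")
    if not 0.0 <= tau < 1.0:
        raise ValueError("tau must lie in [0, 1)")
    n = len(costs)

    def objective(s: float) -> float:
        # Mean excess of the costs over the threshold s.
        excess = sum(max(x - s, 0.0) for x in costs) / n
        return s + excess / (1.0 - tau)

    return min(objective(s) for s in costs)


# Hypothetical discounted costs from four simulated runs (illustrative only).
sample_costs = [1.0, 2.0, 3.0, 4.0]
avar_half = empirical_avar(sample_costs, 0.5)
```

Under this representation, tau = 0 gives the sample mean, and as tau approaches 1 the estimate approaches the largest observed cost, which is one way to read tau as a risk aversion parameter.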
Pages: 361-379
Page count: 19
Related papers
18 in total
[1]   On the coherence of expected shortfall [J].
Acerbi, C ;
Tasche, D .
JOURNAL OF BANKING & FINANCE, 2002, 26 (07) :1487-1503
[2]  
[Anonymous], GEN THEORY MARKOVIAN
[3]   Coherent multiperiod risk adjusted values and Bellman's principle [J].
Artzner, Philippe ;
Delbaen, Freddy ;
Eber, Jean-Marc ;
Heath, David ;
Ku, Hyejin .
ANNALS OF OPERATIONS RESEARCH, 2007, 152 (1) :5-22
[4]  
Bäuerle N, 2011, UNIVERSITEXT, P1, DOI 10.1007/978-3-642-18324-9
[5]   Dynamic mean-risk optimization in a binomial model [J].
Baeuerle, Nicole ;
Mundt, Andre .
MATHEMATICAL METHODS OF OPERATIONS RESEARCH, 2009, 70 (02) :219-239
[6]   Dynamic risk measures: Time consistency and risk measures from BMO martingales [J].
Bion-Nadal, Jocelyne .
FINANCE AND STOCHASTICS, 2008, 12 (02) :219-244
[7]   Time consistent dynamic risk measures [J].
Boda, K ;
Filar, JA .
MATHEMATICAL METHODS OF OPERATIONS RESEARCH, 2006, 63 (01) :169-186
[8]   Stochastic target hitting time and the problem of early retirement [J].
Boda, K ;
Filar, JA ;
Lin, YL ;
Spanjers, L .
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2004, 49 (03) :409-419
[9]   Finite-horizon dynamic optimisation when the terminal reward is a concave functional of the distribution of the final state [J].
Collins, EJ ;
McNamara, JM .
ADVANCES IN APPLIED PROBABILITY, 1998, 30 (01) :122-136
[10]   RISK-SENSITIVE MARKOV DECISION PROCESSES [J].
HOWARD, RA ;
MATHESON, JE .
MANAGEMENT SCIENCE SERIES A-THEORY, 1972, 18 (07) :356-369