Regret Bounds for Risk-Sensitive Reinforcement Learning

Citations: 0
Authors
Bastani, Osbert [1]
Ma, Yecheng Jason [1]
Shen, Estelle [1]
Xu, Wanqiao [2]
Affiliations
[1] Univ Penn, Philadelphia, PA 19104 USA
[2] Stanford Univ, Stanford, CA USA
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022 | 2022
Keywords
VALUE-AT-RISK;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In safety-critical applications of reinforcement learning such as healthcare and robotics, it is often desirable to optimize risk-sensitive objectives that account for tail outcomes rather than expected reward. We prove the first regret bounds for reinforcement learning under a general class of risk-sensitive objectives including the popular CVaR objective. Our theory is based on a novel characterization of the CVaR objective as well as a novel optimistic MDP construction.
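To illustrate how a tail-sensitive objective such as CVaR differs from expected reward, here is a minimal sketch. It assumes the common lower-tail convention for returns (the mean of the worst alpha fraction of outcomes) and a simple sorting-based empirical estimator; the function name `empirical_cvar` and the sample data are hypothetical and are not taken from the paper.

```python
import math


def empirical_cvar(returns, alpha):
    """Mean of the worst ceil(alpha * n) returns (lower-tail convention, assumed here)."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    if not returns:
        raise ValueError("returns must be non-empty")
    k = math.ceil(alpha * len(returns))  # number of tail samples to average
    worst = sorted(returns)[:k]          # the k smallest returns
    return sum(worst) / k


# Hypothetical episode returns: mostly good, with two rare bad outcomes.
returns = [10.0, 12.0, -5.0, 8.0, 11.0, -20.0, 9.0, 13.0, 7.0, 10.0]

mean_return = sum(returns) / len(returns)  # expected-reward objective
cvar_20 = empirical_cvar(returns, 0.2)     # average of the worst 20% of returns

print(f"mean = {mean_return}, CVaR(0.2) = {cvar_20}")
```

On this sample the mean is 5.5 while CVaR at level 0.2 is -12.5, so the two objectives can rank the same policy very differently when rare bad outcomes occur. With alpha = 1 the estimator reduces to the ordinary mean.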
Pages: 11
Related papers
50 items in total
[41]   Discussion of “How Banks' Value-at-Risk Disclosures Predict their Total and Priced Risk: Effects of Bank Technical Sophistication and Learning over Time” [J].
Bin Ke .
Review of Accounting Studies, 2004, 9 :295-299
[42]   Toward Improving the Distributional Robustness of Risk-Aware Controllers in Learning-Enabled Environments [J].
Hakobyan, Astghik ;
Yang, Insoon .
2021 60TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2021, :6024-6031
[43]   Value-at-Risk forecasting: A hybrid ensemble learning GARCH-LSTM based approach [J].
Kakade, Kshitij ;
Jain, Ishan ;
Mishra, Aswini Kumar .
RESOURCES POLICY, 2022, 78
[44]   Investment risk forecasting model using extreme value theory approach combined with machine learning [J].
Melina, Melina ;
Sukono ;
Napitupulu, Herlina ;
Mohamed, Norizan .
AIMS MATHEMATICS, 2024, 9 (11) :33314-33352
[45]   A Novel Modeling Technique for the Forecasting of Multiple-Asset Trading Volumes: Innovative Initial-Value-Problem Differential Equation Algorithms for Reinforcement Machine Learning [J].
Al Janabi, Mazin A. M. .
COMPLEXITY, 2022, 2022
[46]   Deep learning of value at risk through generative neural network models: The case of the Variational auto encoder [J].
Brugiere, Pierre ;
Turinici, Gabriel .
METHODSX, 2023, 10
[47]   To assess the multiperiod market risk with deep learning method taking the boosting additive quantile regression as an example [J].
Guan, Min .
COMPUTATIONAL INTELLIGENCE, 2022, 38 (01) :216-228
[48]   Forecasting Bitcoin Volatility and Value-at-Risk Using Stacking Machine Learning Models With Intraday Data [J].
Pourrezaee, Arash ;
Hajizadeh, Ehsan .
COMPUTATIONAL ECONOMICS, 2024,
[49]   A Novel Data Driven Machine Learning Algorithm For Fuzzy Estimates of Optimal Portfolio Weights and Risk Tolerance Coefficient [J].
Thavaneswaran, Aerambamoorthy ;
Liang, You ;
Paseka, Alex ;
Hoque, Md Erfanul ;
Thulasiram, Ruppa K. .
IEEE CIS INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS 2021 (FUZZ-IEEE), 2021,
[50]   Risk-Based Robust Statistical Learning by Stochastic Difference-of-Convex Value-Function Optimization [J].
Liu, Junyi ;
Pang, Jong-Shi .
OPERATIONS RESEARCH, 2023, 71 (02) :397-414