Regret Bounds for Risk-Sensitive Reinforcement Learning

Cited by: 0

Authors:

Bastani, Osbert ^{[1]}

Ma, Yecheng Jason ^{[1]}

Shen, Estelle ^{[1]}

Xu, Wanqiao ^{[2]}

Affiliations:

[1] Univ Penn, Philadelphia, PA 19104 USA

[2] Stanford Univ, Stanford, CA USA

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022 | 2022

Keywords:

VALUE-AT-RISK

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In safety-critical applications of reinforcement learning such as healthcare and robotics, it is often desirable to optimize risk-sensitive objectives that account for tail outcomes rather than expected reward. We prove the first regret bounds for reinforcement learning under a general class of risk-sensitive objectives including the popular CVaR objective. Our theory is based on a novel characterization of the CVaR objective as well as a novel optimistic MDP construction.
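
For orientation, CVaR at level alpha of a return distribution is commonly described as the expected return over the worst alpha-fraction of outcomes. The Python sketch below estimates that quantity from sampled episode returns. The function name `empirical_cvar`, the sample values, and the convention that higher returns are better are illustrative assumptions; this is neither the paper's algorithm nor its characterization of CVaR.

```python
import math
from typing import Sequence


def empirical_cvar(returns: Sequence[float], alpha: float) -> float:
    """Average of the worst ceil(alpha * n) sampled returns (lower-tail CVaR).

    Hypothetical helper for illustration; assumes higher returns are better.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    if len(returns) == 0:
        raise ValueError("returns must be non-empty")
    ordered = sorted(returns)  # ascending, so the worst outcomes come first
    k = max(1, math.ceil(alpha * len(ordered)))  # size of the lower tail
    tail = ordered[:k]
    return sum(tail) / len(tail)


# Made-up sample of episode returns, for illustration only.
sample_returns = [4.0, 1.0, 3.0, 2.0]
print(empirical_cvar(sample_returns, 0.5))  # mean of the two lowest returns
print(empirical_cvar(sample_returns, 1.0))  # alpha = 1 gives the sample mean
```

With alpha = 1 the estimate reduces to the ordinary sample mean, so the expected-reward objective appears as a special case, while smaller alpha puts all the weight on the lower tail.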


Pages: 11
