Regret Bounds for Risk-Sensitive Reinforcement Learning

Cited by: 0
Authors
Bastani, Osbert [1]
Ma, Yecheng Jason [1]
Shen, Estelle [1]
Xu, Wanqiao [2]
Affiliations
[1] Univ Penn, Philadelphia, PA 19104 USA
[2] Stanford Univ, Stanford, CA USA
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022 | 2022
Keywords
VALUE-AT-RISK;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In safety-critical applications of reinforcement learning such as healthcare and robotics, it is often desirable to optimize risk-sensitive objectives that account for tail outcomes rather than expected reward. We prove the first regret bounds for reinforcement learning under a general class of risk-sensitive objectives including the popular CVaR objective. Our theory is based on a novel characterization of the CVaR objective as well as a novel optimistic MDP construction.
Pages: 11
Related papers
50 in total
  • [31] Good-Deal Bounds for Option Prices under Value-at-Risk and Expected Shortfall Constraints
    Desmettre, Sascha
    Laudage, Christian
    Sass, Joern
    [J]. RISKS, 2020, 8 (04) : 1 - 22
  • [32] Monitoring bank risk around the world using unsupervised learning
    Mercadier, Mathieu
    Tarazi, Amine
    Armand, Paul
    Lardy, Jean-Pierre
    [J]. EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2025, 324 (02) : 590 - 615
  • [33] A scenario optimization approach to reliability-based and risk-based design: Soft-constrained modulation of failure probability bounds
    Rocchetta, Roberto
    Crespo, Luis G.
    [J]. RELIABILITY ENGINEERING & SYSTEM SAFETY, 2021, 216
  • [34] Using financial risk measures for analyzing generalization performance of machine learning models
    Takeda, Akiko
    Kanamori, Takafumi
    [J]. NEURAL NETWORKS, 2014, 57 : 29 - 38
  • [35] Learning Disturbances Online for Risk-Aware Control: Risk-Aware Flight with Less Than One Minute of Data
    Akella, Prithvi
    Wei, Skylar X.
    Burdick, Joel W.
    Ames, Aaron D.
    [J]. LEARNING FOR DYNAMICS AND CONTROL CONFERENCE, VOL 211, 2023, 211
  • [36] How Banks' Value-at-Risk Disclosures Predict their Total and Priced Risk: Effects of Bank Technical Sophistication and Learning over Time
    Chi-chun Liu
    Stephen G. Ryan
    Hung Tan
    [J]. Review of Accounting Studies, 2004, 9 : 265 - 294
  • [37] How banks' value-at-risk disclosures predict their total and priced risk: Effects of bank technical sophistication and learning over time
    Liu, CC
    Ryan, SG
    Tan, H
    [J]. REVIEW OF ACCOUNTING STUDIES, 2004, 9 (2-3) : 265 - 294
  • [38] Financial Fraud Detection Using Value-at-Risk With Machine Learning in Skewed Data
    Usman, Abdullahi Ubale
    Abdullahi, Sunusi Bala
    Yu, Liping
    Alghofaily, Bayan
    Almasoud, Ahmed S.
    Rehman, Amjad
    [J]. IEEE ACCESS, 2024, 12 : 64285 - 64299
  • [39] Value at Risk Measurement Method under Deep Learning in Analysing the Excessive Financialization of Enterprises
    Shao, Bin-Tao
    [J]. JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2024, 40 (04) : 849 - 863
  • [40] Discussion of "how banks' value-at-risk disclosures predict their total and priced risk: Effects of bank technical sophistication and learning over time"
    Ke, B
    [J]. REVIEW OF ACCOUNTING STUDIES, 2004, 9 (2-3) : 295 - 299