Regret Bounds for Risk-Sensitive Reinforcement Learning

Cited by: 0

Authors:

Bastani, Osbert [1]

Ma, Yecheng Jason [1]

Shen, Estelle [1]

Xu, Wanqiao [2]

Affiliations:

[1] Univ Penn, Philadelphia, PA 19104 USA

[2] Stanford Univ, Stanford, CA USA

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022 | 2022

Keywords:

VALUE-AT-RISK;

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In safety-critical applications of reinforcement learning such as healthcare and robotics, it is often desirable to optimize risk-sensitive objectives that account for tail outcomes rather than expected reward. We prove the first regret bounds for reinforcement learning under a general class of risk-sensitive objectives including the popular CVaR objective. Our theory is based on a novel characterization of the CVaR objective as well as a novel optimistic MDP construction.
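To illustrate how a tail-focused objective differs from expected reward, here is a minimal sketch of an empirical CVaR estimate. It assumes the common lower-tail convention, in which CVaR at level alpha is the average of the worst alpha-fraction of returns; the paper's exact definition and notation may differ. The function name `empirical_cvar` and the sample returns are hypothetical and chosen only for illustration.

```python
import math


def empirical_cvar(returns, alpha):
    """Average of the worst ceil(alpha * n) returns (lower-tail CVaR estimate).

    Assumption: higher returns are better, so "risk" lives in the low tail.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    if not returns:
        raise ValueError("returns must be non-empty")
    ordered = sorted(returns)                # ascending: worst outcomes first
    k = math.ceil(alpha * len(ordered))      # number of tail samples to keep
    tail = ordered[:k]
    return sum(tail) / len(tail)


# Hypothetical episode returns, for illustration only.
returns = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
mean_return = sum(returns) / len(returns)    # risk-neutral objective
cvar_25 = empirical_cvar(returns, 0.25)      # averages the two worst returns
```

With these values the mean return is 4.5 while the CVaR at level 0.25 is 1.5, so two policies with the same expected reward can rank very differently once only the worst outcomes are counted. At alpha = 1 the estimate reduces to the ordinary mean.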

Pages: 11
