Robust Risk-Sensitive Reinforcement Learning with Conditional Value-at-Risk

Cited by: 0

Authors:

Ni, Xinyi [1]

Lai, Lifeng [1]

Affiliations:

[1] Univ Calif Davis, Elect & Comp Engn, Davis, CA USA

Source:

2024 IEEE Information Theory Workshop, ITW 2024 | 2024

Funding:

National Science Foundation (USA)

Keywords:

ambiguity sets; RMDP; risk-sensitive RL; CVaR; optimization

DOI:

10.1109/ITW61385.2024.10806953

Chinese Library Classification (CLC) code:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Robust Markov Decision Processes (RMDPs) have received significant research interest as an alternative to standard Markov Decision Processes (MDPs), which often assume fixed transition probabilities. RMDPs relax this assumption by optimizing for the worst-case scenario within an ambiguity set. Earlier studies on RMDPs have largely centered on risk-neutral reinforcement learning (RL), where the goal is to minimize the expected total discounted cost. In this paper, we analyze the robustness of CVaR-based risk-sensitive RL under RMDPs. First, we consider predetermined ambiguity sets. Based on the coherency of CVaR, we establish a connection between robustness and risk sensitivity, so that techniques from risk-sensitive RL can be adopted to solve the proposed problem. Furthermore, motivated by the existence of decision-dependent uncertainty in real-world problems, we study problems with state-action-dependent ambiguity sets. To solve these, we define a new risk measure named NCVaR and establish the equivalence of NCVaR optimization and robust CVaR optimization. We further propose value iteration algorithms and validate our approach in simulation experiments.
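
The link between robustness and risk sensitivity mentioned in the abstract can be illustrated on a small example. The sketch below is not taken from the paper: it assumes a finite cost distribution and the common convention that CVaR at level alpha averages the worst (1 - alpha) fraction of costs. The function names `cvar_primal` and `cvar_dual` are hypothetical. The first uses the standard Rockafellar-Uryasev minimization form; the second uses the standard dual form, a worst-case expectation over reweighted distributions, which is the kind of identity a coherent risk measure admits.

```python
import numpy as np


def cvar_primal(costs, probs, alpha):
    """CVaR via min over t of t + E[(X - t)^+] / (1 - alpha)."""
    costs = np.asarray(costs, dtype=float)
    probs = np.asarray(probs, dtype=float)
    # The objective is convex and piecewise linear in t, so for a discrete
    # distribution the minimum is attained at one of the cost values.
    candidates = [
        t + probs @ np.maximum(costs - t, 0.0) / (1.0 - alpha) for t in costs
    ]
    return float(min(candidates))


def cvar_dual(costs, probs, alpha):
    """CVaR as a worst-case expectation over distributions q with
    0 <= q_i <= p_i / (1 - alpha) and sum(q) == 1."""
    costs = np.asarray(costs, dtype=float)
    probs = np.asarray(probs, dtype=float)
    remaining = 1.0  # probability mass still to be assigned
    total = 0.0
    # Greedy solution: push as much mass as allowed onto the largest costs.
    for i in np.argsort(-costs):
        mass = min(probs[i] / (1.0 - alpha), remaining)
        total += mass * costs[i]
        remaining -= mass
        if remaining <= 0.0:
            break
    return float(total)


if __name__ == "__main__":
    costs = [1.0, 2.0, 10.0]   # illustrative values only
    probs = [0.5, 0.4, 0.1]
    print(cvar_primal(costs, probs, alpha=0.8))
    print(cvar_dual(costs, probs, alpha=0.8))
```

Both functions agree on this toy distribution (up to floating-point rounding), which shows in miniature how a risk-sensitive objective can be read as a robust one over a set of reweighted distributions. The ambiguity sets and the NCVaR measure studied in the paper are not reproduced here.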


Pages: 520-525

Page count: 6
