Robust Risk-Sensitive Reinforcement Learning with Conditional Value-at-Risk

Times cited: 0
Authors
Ni, Xinyi [1]
Lai, Lifeng [1]
Affiliations
[1] Univ Calif Davis, Elect & Comp Engn, Davis, CA USA
Source
2024 IEEE INFORMATION THEORY WORKSHOP, ITW 2024 | 2024
Funding
National Science Foundation (USA);
Keywords
ambiguity sets; RMDP; risk-sensitive RL; CVaR; OPTIMIZATION;
DOI
10.1109/ITW61385.2024.10806953
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Robust Markov Decision Processes (RMDPs) have received significant research interest as an alternative to standard Markov Decision Processes (MDPs), which often assume fixed transition probabilities. RMDPs relax this assumption by optimizing for the worst-case scenario within an ambiguity set. Earlier studies on RMDPs have largely centered on risk-neutral reinforcement learning (RL), where the goal is to minimize the expected total discounted cost. In this paper, we analyze the robustness of CVaR-based risk-sensitive RL under RMDPs. First, we consider predetermined ambiguity sets. Based on the coherence of CVaR, we establish a connection between robustness and risk sensitivity, so that techniques from risk-sensitive RL can be adopted to solve the proposed problem. Furthermore, motivated by the existence of decision-dependent uncertainty in real-world problems, we study problems with state-action-dependent ambiguity sets. To solve these, we define a new risk measure named NCVaR and establish the equivalence of NCVaR optimization and robust CVaR optimization. We further propose value iteration algorithms and validate our approach in simulation experiments.
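The link between coherence and robustness mentioned in the abstract can be illustrated with the standard dual representation of CVaR: for a cost distribution, CVaR at tail level alpha equals the worst-case expected cost over all reweightings of the nominal probabilities whose density ratio is at most 1/alpha. The sketch below checks this numerically on a small discrete distribution. It is not taken from the paper: the function names, the tail-level convention for alpha, and the toy numbers are all assumptions made for illustration, and it does not implement NCVaR or the proposed value iteration algorithms.

```python
import numpy as np


def cvar_worst_case_reweighting(costs, probs, alpha):
    """Robust form (illustrative): max of E_q[C] over distributions q with
    q_i <= p_i / alpha. For a discrete distribution the maximizer greedily
    moves as much mass as allowed onto the largest costs."""
    costs = np.asarray(costs, dtype=float)
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(-costs)  # indices sorted by cost, largest first
    remaining = alpha           # nominal tail mass still to be assigned
    total = 0.0
    for c, p in zip(costs[order], probs[order]):
        take = min(p, remaining)
        total += c * take
        remaining -= take
        if remaining <= 0.0:
            break
    return total / alpha


def cvar_rockafellar_uryasev(costs, probs, alpha):
    """Risk-measure form (illustrative): min over t of t + E[(C - t)^+] / alpha.
    The objective is piecewise linear and convex in t, so for a discrete
    distribution it suffices to evaluate it at the atoms."""
    costs = np.asarray(costs, dtype=float)
    probs = np.asarray(probs, dtype=float)
    return min(
        t + np.dot(probs, np.maximum(costs - t, 0.0)) / alpha for t in costs
    )


# Hypothetical toy cost distribution, chosen only for this example.
toy_costs = [0.0, 1.0, 2.0, 10.0]
toy_probs = [0.4, 0.3, 0.2, 0.1]

robust_value = cvar_worst_case_reweighting(toy_costs, toy_probs, alpha=0.2)
risk_value = cvar_rockafellar_uryasev(toy_costs, toy_probs, alpha=0.2)
print(round(robust_value, 6), round(risk_value, 6))
```

With these toy numbers both forms agree at approximately 6.0, the average of the worst 20% of outcomes, while at alpha = 1 both reduce to the ordinary expected cost of 1.7. The agreement of the two forms is the sense in which a risk-sensitive objective can be read as a worst-case expectation over a set of distributions.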
Pages: 520-525
Page count: 6
Related papers
50 in total
  • [21] Policy Gradient Based Entropic-VaR Optimization in Risk-Sensitive Reinforcement Learning
    Ni, Xinyi
    Lai, Lifeng
    [J]. 2022 58TH ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON), 2022,
  • [22] Risk-Aware Stochastic Ship Routing using Conditional Value-at-Risk
    Nunez, Andre
    Kong, Felix H.
    Gonzalez-Cantos, Alberto
    Fitch, Robert
    [J]. 2023 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2023, : 10543 - 10550
  • [23] Constrained Risk-Sensitive Deep Reinforcement Learning for eMBB-URLLC Joint Scheduling
    Zhang, Wenheng
    Derakhshani, Mahsa
    Zheng, Gan
    Lambotharan, Sangarapillai
    [J]. IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2024, 23 (09) : 10608 - 10624
  • [24] A robust multilayer extreme learning machine using kernel risk-sensitive loss criterion
    Luo, Xiong
    Li, Ying
    Wang, Weiping
    Ban, Xiaojuan
    Wang, Jenq-Haur
    Zhao, Wenbing
    [J]. INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2020, 11 (01) : 197 - 216
  • [26] Robust scenario-based value-at-risk optimization
    Romanko, Oleksandr
    Mausser, Helmut
    [J]. ANNALS OF OPERATIONS RESEARCH, 2016, 237 (1-2) : 203 - 218
  • [27] An Online Risk Based Security Assessment Via Conditional Value-at-risk in Uncertain Environment
    Deng, Wei-si
    Wu, Jia-si
    Zhang, Bu-han
    Ding, Hong-fa
    [J]. 2016 INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING AND AUTOMATION (ICEEA 2016), 2016,
  • [28] Cooperative Dispatch of Renewable-Penetrated Microgrids Alliances Using Risk-Sensitive Reinforcement Learning
    Zhu, Ziqing
    Gao, Xiang
    Bu, Siqi
    Chan, Ka Wing
    Zhou, Bin
    Xia, Shiwei
    [J]. IEEE TRANSACTIONS ON SUSTAINABLE ENERGY, 2024, 15 (04) : 2194 - 2208
  • [29] Investigating the agricultural losses due to climate variability: An application of conditional value-at-risk approach
    Sukono
    Riaman
    Supian, Sudradjat
    Hidayat, Yuyun
    Saputra, Jumadil
    Pribadi, Diantiny Mariam
    [J]. DECISION SCIENCE LETTERS, 2021, 10 (01) : 71 - 78
  • [30] Index tracking and enhanced indexing using mixed conditional value-at-risk
    Goel, Anubha
    Sharma, Amita
    Mehra, Aparna
    [J]. JOURNAL OF COMPUTATIONAL AND APPLIED MATHEMATICS, 2018, 335 : 361 - 380