Risk-sensitive REINFORCE: A Monte Carlo Policy Gradient Algorithm for Exponential Performance Criteria

Cited by: 3
Authors
Noorani, Erfaun [1 ,2 ,3 ]
Baras, John S. [1 ,2 ]
Affiliations
[1] Univ Maryland, Dept Elect & Comp Engn, College Pk, MD 20742 USA
[2] Univ Maryland, Inst Syst Res ISR, College Pk, MD 20742 USA
[3] Clark Sch Engn, College Pk, MD 20742 USA
Source
2021 60TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC) | 2021
Keywords
STOCHASTIC LINEAR-SYSTEMS; GAMES;
DOI
10.1109/CDC45484.2021.9683645
CLC number
TP [automation technology, computer technology]
Subject classification code
0812
Abstract
Risk is an inherent component of any decision-making process under uncertainty, and failure to account for risk can lead to significant performance degradation. We present a policy gradient theorem for the risk-sensitive control "exponential of integral" criterion and propose a risk-sensitive Monte Carlo policy gradient algorithm. Our simulations, together with our theoretical analysis, show that using the exponential criterion with an appropriately chosen risk parameter not only yields a risk-sensitive policy but also reduces variance during the learning process and accelerates learning, which in turn produces a policy with higher expected return; that is, risk sensitivity leads to sample efficiency and improved performance.
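The abstract describes a REINFORCE-style update in which the Monte Carlo return is passed through the exponential criterion, i.e., the policy is optimized for (1/β) log E[exp(β R)] rather than E[R]. A minimal sketch of that idea on a hypothetical two-armed bandit (the environment, step size, and risk parameter below are illustrative assumptions, not taken from the paper):

```python
import numpy as np

# Hypothetical two-armed bandit, for illustration only:
# arm 0 ("safe") always pays 1; arm 1 ("risky") pays 2 with
# probability 0.8 and -1 otherwise, so it has the higher mean (1.4)
# but also higher variance.
rng = np.random.default_rng(0)

def pull(arm):
    if arm == 0:
        return 1.0
    return 2.0 if rng.random() < 0.8 else -1.0

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

beta = -2.0          # beta < 0: risk-averse exponential criterion
lr = 0.1             # step size (illustrative choice)
theta = np.zeros(2)  # parameters of a softmax policy over the two arms

for _ in range(5000):
    p = softmax(theta)
    a = rng.choice(2, p=p)
    r = pull(a)
    grad_log_pi = -p.copy()
    grad_log_pi[a] += 1.0  # d/dtheta of log pi(a) for a softmax policy
    # Risk-sensitive REINFORCE step: the plain return R is replaced by
    # (1/beta) * exp(beta * R), a Monte Carlo estimate (up to a positive
    # normalising constant) of the gradient of (1/beta) log E[exp(beta R)].
    theta += lr * (1.0 / beta) * np.exp(beta * r) * grad_log_pi

p_final = softmax(theta)

# Certainty equivalents (1/beta) log E[exp(beta R)] of the two arms:
ce_safe = 1.0
ce_risky = (1.0 / beta) * np.log(
    0.8 * np.exp(beta * 2.0) + 0.2 * np.exp(beta * -1.0)
)
```

With β < 0 the certainty equivalent of the risky arm drops below that of the safe arm even though the risky arm has the higher mean, so the learned policy should concentrate on the safe arm; as β → 0 the criterion recovers the risk-neutral expected return.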
Pages: 1522-1527
Page count: 6
Related papers
2 records
  • [1] Embracing Risk in Reinforcement Learning: The Connection between Risk-Sensitive Exponential and Distributionally Robust Criteria
    Noorani, Erfaun
    Baras, John S.
    2022 AMERICAN CONTROL CONFERENCE, ACC, 2022, : 2703 - 2708
  • [2] Discrete-time Decentralized Control using the Risk-sensitive Performance Criterion in the Large Population Regime: A Mean Field Approach
    Moon, Jun
    Basar, Tamer
    2015 AMERICAN CONTROL CONFERENCE (ACC), 2015, : 4779 - 4784