Constrained reinforcement learning with statewise projection: a control barrier function approach

Cited by: 1
Authors
Jin, Xinze [1]
Li, Kuo [1]
Jia, Qingshan [1]
Affiliations
[1] Tsinghua University, Department of Automation, Beijing 100084, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
reinforcement learning; safe projection; control barrier function; safety; framework
DOI
10.1007/s11432-023-3872-9
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Safety is a critical issue for reinforcement learning (RL), as it may be risky for some actual applications if the learning process involves unsafe exploration. Instead of formulating constraints as expectation-based in constrained RL, considering statewise safety in constrained RL is more meaningful. This work aims to address the issue of safe projection in RL by introducing a control barrier function that inherently learns a safe policy through a set certificate. We seek to analyze some theoretical properties of safe projection in the learning process, including convergence and performance bound, and extend the discussion into ensembles and guided controllers. Moreover, we approach analytical solutions for deterministic and stochastic system dynamics. Experimental results in different tasks show that the proposed method achieves better effects in terms of both performance and safety.
Pages: 19