Constrained reinforcement learning with statewise projection: a control barrier function approach

Cited by: 1

Authors:

Jin, Xinze [1]

Li, Kuo [1]

Jia, Qingshan [1]

Affiliations:

[1] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China

Source:

SCIENCE CHINA-INFORMATION SCIENCES | 2024, Vol. 67, Issue 03

Funding:

National Natural Science Foundation of China;

Keywords:

reinforcement learning; safe projection; control barrier function; SAFETY; FRAMEWORK;

DOI:

10.1007/s11432-023-3872-9

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Safety is a critical issue for reinforcement learning (RL): unsafe exploration during learning can be risky in real-world applications. Rather than formulating constraints in expectation, as is common in constrained RL, it is more meaningful to consider statewise safety. This work addresses safe projection in RL by introducing a control barrier function that inherently learns a safe policy through a set certificate. We analyze theoretical properties of safe projection in the learning process, including convergence and a performance bound, and extend the discussion to ensembles and guided controllers. Moreover, we pursue analytical solutions for deterministic and stochastic system dynamics. Experimental results on different tasks show that the proposed method achieves better results in terms of both performance and safety.
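
To make the idea of statewise safe projection concrete, the following is a minimal sketch and not the method of the paper. It assumes a single-integrator model x_next = x + dt * u, an affine barrier function h(x) = c . x + d (the safe set is h(x) >= 0), and the standard discrete-time barrier condition h(x_next) >= (1 - eta) * h(x). Under these assumptions the condition reduces to one linear constraint on the action, c . u >= -eta * h(x) / dt, so the Euclidean projection of the RL action onto the safe actions has a closed form. The names `project_action`, `cbf_safe_action`, `eta`, and `dt` are hypothetical and chosen only for illustration.

```python
def project_action(u_rl, a, b):
    """Return the action closest to u_rl (Euclidean norm) with a . u >= b.

    This is the closed-form solution of a quadratic program with a
    single half-space constraint; it is an illustrative assumption,
    not the projection operator defined in the paper.
    """
    violation = b - sum(ai * ui for ai, ui in zip(a, u_rl))
    if violation <= 0.0:
        # The proposed action already satisfies the constraint.
        return list(u_rl)
    scale = violation / sum(ai * ai for ai in a)
    # Move along the constraint normal just far enough to be feasible.
    return [ui + scale * ai for ui, ai in zip(u_rl, a)]


def cbf_safe_action(x, u_rl, c, d, eta=0.5, dt=0.1):
    """Project u_rl so that h(x + dt*u) >= (1 - eta) * h(x).

    Assumes single-integrator dynamics and h(x) = c . x + d.
    """
    h = sum(ci * xi for ci, xi in zip(c, x)) + d
    # h(x + dt*u) = h + dt * (c . u), so the condition is c . u >= -eta*h/dt.
    b = -eta * h / dt
    return project_action(u_rl, c, b)


# Example: 1-D state, safe set x <= 1, i.e. h(x) = 1 - x.
safe_u = cbf_safe_action(x=[0.0], u_rl=[20.0], c=[-1.0], d=1.0)
kept_u = cbf_safe_action(x=[0.0], u_rl=[1.0], c=[-1.0], d=1.0)
```

In this example the aggressive action 20.0 is scaled back so that the next state still satisfies the barrier condition, while the mild action 1.0 is left unchanged because it is already safe.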

Pages: 19
