Safe and Efficient Reinforcement Learning Using Disturbance-Observer-Based Control Barrier Functions

Cited by: 0

Authors:

Cheng, Yikun [1]

Zhao, Pan [1]

Hovakimyan, Naira [1]

Affiliations:

[1] Univ Illinois, Urbana, IL 61801 USA

Source:

LEARNING FOR DYNAMICS AND CONTROL CONFERENCE, VOL 211 | 2023 / Vol. 211

Keywords:

Reinforcement learning; robot safety; robust control; uncertainty estimation

DOI:

Not available

Chinese Library Classification number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Safe reinforcement learning (RL) with assured satisfaction of hard state constraints during training has recently received a lot of attention. Safety filters, e.g., based on control barrier functions (CBFs), provide a promising way for safe RL via modifying the unsafe actions of an RL agent on the fly. Existing safety filter-based approaches typically involve learning of uncertain dynamics and quantifying the learned model error, which leads to conservative filters before a large amount of data is collected to learn a good model, thereby preventing efficient exploration. This paper presents a method for safe and efficient RL using disturbance observers (DOBs) and control barrier functions (CBFs). Unlike most existing safe RL methods that deal with hard state constraints, our method does not involve model learning, and leverages DOBs to accurately estimate the pointwise value of the uncertainty, which is then incorporated into a robust CBF condition to generate safe actions. The DOB-based CBF can be used as a safety filter with model-free RL algorithms by minimally modifying the actions of an RL agent whenever necessary to ensure safety throughout the learning process. Simulation results on a unicycle and a 2D quadrotor demonstrate that the proposed method outperforms a state-of-the-art safe RL algorithm using CBFs and Gaussian processes-based model learning, in terms of safety violation rate, and sample and computational efficiency.
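As an illustration of the safety-filter idea described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical scalar system x_dot = u + d with the constraint x <= x_max, a barrier h(x) = x_max - x, and a disturbance estimate d_hat whose error is bounded by err_bound. The estimate here is a stand-in, not an actual disturbance observer. All names and numerical values (dob_cbf_filter, alpha, err_bound, the sinusoidal disturbance) are assumptions made for this example. For a scalar input, the minimal modification of the RL action reduces to clipping against the bound implied by the robust CBF condition.

```python
import math


def dob_cbf_filter(u_rl, x, d_hat, x_max=1.0, alpha=2.0, err_bound=0.1):
    """Minimally modify u_rl so that x stays <= x_max for x_dot = u + d.

    Barrier: h(x) = x_max - x, so the safe set is h(x) >= 0.
    Robust CBF condition, assuming |d - d_hat| <= err_bound:
        -(u + d_hat) - err_bound >= -alpha * h(x)
    which is equivalent to u <= alpha * h(x) - d_hat - err_bound.
    """
    h = x_max - x
    u_upper = alpha * h - d_hat - err_bound
    # Closest action to u_rl that satisfies the upper bound.
    return min(u_rl, u_upper)


def simulate(steps=2000, dt=0.001):
    """Forward-Euler rollout with a constant aggressive action; returns max state."""
    x = 0.0
    max_x = x
    for k in range(steps):
        t = k * dt
        d = 0.5 * math.sin(2.0 * t)  # true disturbance, unknown to the filter
        d_hat = d + 0.05             # stand-in estimate, error within err_bound
        u_rl = 3.0                   # hypothetical RL action pushing toward the boundary
        u = dob_cbf_filter(u_rl, x, d_hat)
        x += dt * (u + d)
        max_x = max(max_x, x)
    return max_x


if __name__ == "__main__":
    print("largest state reached:", simulate())
```

In this sketch an action that already satisfies the bound passes through unchanged, which mirrors the "minimally modifying ... whenever necessary" behavior the abstract describes. Multi-input systems would generally need a quadratic program instead of a simple clip.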


Pages: 12
