Lyapunov Design for Robust and Efficient Robotic Reinforcement Learning

Times Cited: 0
Authors
Westenbroek, Tyler [1 ]
Castaneda, Fernando [2 ]
Agrawal, Ayush [2 ]
Sastry, Shankar [1 ]
Sreenath, Koushil [2 ]
Affiliations
[1] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, Dept Mech Engn, Berkeley, CA USA
Source
CONFERENCE ON ROBOT LEARNING, VOL 205 | 2022 / Vol. 205
Keywords
MODEL-PREDICTIVE CONTROL; RECEDING-HORIZON CONTROL; STABILITY; SYSTEMS; STABILIZATION;
DOI
None available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent advances in the reinforcement learning (RL) literature have enabled roboticists to automatically train complex policies in simulated environments. However, due to the poor sample complexity of these methods, solving RL problems using real-world data remains challenging. This paper introduces a novel cost-shaping method which aims to reduce the number of samples needed to learn a stabilizing controller. The method adds a term involving a Control Lyapunov Function (CLF) - an 'energy-like' function from the model-based control literature - to typical cost formulations. Theoretical results demonstrate that the new costs lead to stabilizing controllers when smaller discount factors are used, which is well known to reduce sample complexity. Moreover, the addition of the CLF term 'robustifies' the search for a stabilizing controller by ensuring that even highly sub-optimal policies will stabilize the system. We demonstrate our approach with two hardware examples in which we learn stabilizing controllers for a cartpole and an A1 quadruped with only seconds and a few minutes of fine-tuning data, respectively. Furthermore, simulation benchmark studies show that obtaining stabilizing policies by optimizing our proposed costs requires orders of magnitude less data compared to standard cost designs.
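To make the cost-shaping idea concrete, the following is a minimal illustrative sketch, not the paper's exact formulation: a standard quadratic state/input cost is augmented with a term that penalizes any one-step increase of a candidate CLF V(x) = x^T P x. All names (`P`, `Q`, `R`, `clf_weight`, `shaped_cost`) are hypothetical choices for this example.

```python
import numpy as np

# Illustrative positive-definite CLF matrix and standard cost weights.
# These values are assumptions for the sketch, not from the paper.
P = np.array([[2.0, 0.5],
              [0.5, 1.0]])   # candidate CLF: V(x) = x^T P x
Q = np.eye(2)                # state cost weight
R = 0.1 * np.eye(1)          # input cost weight
clf_weight = 1.0             # weight on the CLF shaping term

def clf(x):
    """Candidate Control Lyapunov Function V(x) = x^T P x."""
    return x @ P @ x

def shaped_cost(x, u, x_next):
    """Quadratic cost plus a penalty on the CLF increasing in one step.

    Penalizing max(V(x') - V(x), 0) pushes the learner toward policies
    that drive the 'energy' V downward, so even noticeably sub-optimal
    low-cost policies tend to be stabilizing.
    """
    base = x @ Q @ x + u @ R @ u
    clf_term = max(clf(x_next) - clf(x), 0.0)
    return base + clf_weight * clf_term
```

For a transition that moves the state toward the origin the shaping term vanishes, while a transition that increases V is charged extra, which is the robustifying effect the abstract describes.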
Pages: 2125-2135
Number of pages: 11
Related Papers
50 records in total
  • [1] Lyapunov design for safe reinforcement learning
    Perkins, TJ
    Barto, AG
    JOURNAL OF MACHINE LEARNING RESEARCH, 2003, 3 (4-5) : 803 - 832
  • [2] Linguistic Lyapunov reinforcement learning control for robotic manipulators
    Kumar, Abhishek
    Sharma, Rajneesh
    NEUROCOMPUTING, 2018, 272 : 84 - 95
  • [3] Efficient Spatiotemporal Transformer for Robotic Reinforcement Learning
    Yang, Yiming
    Xing, Dengpeng
    Xu, Bo
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2022, 7 (03) : 7982 - 7989
  • [4] Adaptive and Robust Network Routing Based on Deep Reinforcement Learning with Lyapunov Optimization
    Zhuang, Zirui
    Wang, Jingyu
    Qi, Qi
    Liao, Jianxin
    Han, Zhu
    2020 IEEE/ACM 28TH INTERNATIONAL SYMPOSIUM ON QUALITY OF SERVICE (IWQOS), 2020,
  • [5] Sample-efficient Reinforcement Learning in Robotic Table Tennis
    Tebbe, Jonas
    Krauch, Lukas
    Gao, Yapeng
    Zell, Andreas
    2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 4171 - 4178
  • [6] Energy Efficient Edge Computing: When Lyapunov Meets Distributed Reinforcement Learning
    Sana, Mohamed
    Merluzzi, Mattia
    di Pietro, Nicola
    Strinati, Emilio Calvanese
    2021 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS WORKSHOPS (ICC WORKSHOPS), 2021,
  • [7] Reinforcement Learning for Robust Trajectory Design of Interplanetary Missions
    Zavoli, Alessandro
    Federici, Lorenzo
    JOURNAL OF GUIDANCE CONTROL AND DYNAMICS, 2021, 44 (08) : 1440 - 1453
  • [8] Estimating Lyapunov Region of Attraction for Robust Model-Based Reinforcement Learning USV
    Xia, Lei
    Cui, Yunduan
    Yi, Zhengkun
    Li, Huiyun
    Wu, Xinyu
    IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2025, 22 : 8898 - 8911
  • [9] Robust Speed Control of Ultrasonic Motors Based on Deep Reinforcement Learning of a Lyapunov Function
    Mustafa, Abdullah
    Sasamura, Tatsuki
    Morita, Takeshi
    IEEE ACCESS, 2022, 10 : 46895 - 46910
  • [10] Robust and efficient task scheduling for robotics applications with reinforcement learning
    Tejer, Mateusz
    Szczepanski, Rafal
    Tarczewski, Tomasz
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 127