Lyapunov Design for Robust and Efficient Robotic Reinforcement Learning

Times Cited: 0
Authors
Westenbroek, Tyler [1 ]
Castaneda, Fernando [2 ]
Agrawal, Ayush [2 ]
Sastry, Shankar [1 ]
Sreenath, Koushil [2 ]
Affiliations
[1] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, Dept Mech Engn, Berkeley, CA USA
Source
CONFERENCE ON ROBOT LEARNING, VOL 205 | 2022 / Vol. 205
Keywords
MODEL-PREDICTIVE CONTROL; RECEDING-HORIZON CONTROL; STABILITY; SYSTEMS; STABILIZATION;
DOI
None available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent advances in the reinforcement learning (RL) literature have enabled roboticists to automatically train complex policies in simulated environments. However, due to the poor sample complexity of these methods, solving RL problems using real-world data remains challenging. This paper introduces a novel cost-shaping method which aims to reduce the number of samples needed to learn a stabilizing controller. The method adds a term involving a Control Lyapunov Function (CLF) - an 'energy-like' function from the model-based control literature - to typical cost formulations. Theoretical results demonstrate that the new costs lead to stabilizing controllers when smaller discount factors are used, which is well known to reduce sample complexity. Moreover, the addition of the CLF term 'robustifies' the search for a stabilizing controller by ensuring that even highly sub-optimal policies will stabilize the system. We demonstrate our approach with two hardware examples in which we learn stabilizing controllers for a cartpole and an A1 quadruped with only seconds and a few minutes of fine-tuning data, respectively. Furthermore, simulation benchmark studies show that obtaining stabilizing policies by optimizing our proposed costs requires orders of magnitude less data compared to standard cost designs.
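To make the cost-shaping idea concrete, the following is a minimal illustrative sketch, not the paper's exact formulation: a standard quadratic state/input cost is augmented with a term that penalizes any one-step increase of a candidate CLF V(x) = x^T P x. All names (`P`, `Q`, `R`, `clf_weight`, `shaped_cost`) are hypothetical choices for this example.

```python
import numpy as np

# Illustrative positive-definite CLF matrix and standard cost weights.
# These values are assumptions for the sketch, not from the paper.
P = np.array([[2.0, 0.5],
              [0.5, 1.0]])   # candidate CLF: V(x) = x^T P x
Q = np.eye(2)                # state cost weight
R = 0.1 * np.eye(1)          # input cost weight
clf_weight = 1.0             # weight on the CLF shaping term

def clf(x):
    """Candidate Control Lyapunov Function V(x) = x^T P x."""
    return x @ P @ x

def shaped_cost(x, u, x_next):
    """Quadratic cost plus a penalty on the CLF increasing in one step.

    Penalizing max(V(x') - V(x), 0) pushes the learner toward policies
    that drive the 'energy' V downward, so even noticeably sub-optimal
    low-cost policies tend to be stabilizing.
    """
    base = x @ Q @ x + u @ R @ u
    clf_term = max(clf(x_next) - clf(x), 0.0)
    return base + clf_weight * clf_term
```

For a transition that moves the state toward the origin the shaping term vanishes, while a transition that increases V is charged extra, which is the robustifying effect the abstract describes.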
Pages: 2125-2135
Number of pages: 11
Related Papers
50 records in total
  • [1] Lyapunov design for safe reinforcement learning
    Perkins, TJ
    Barto, AG
    JOURNAL OF MACHINE LEARNING RESEARCH, 2003, 3 (4-5) : 803 - 832
  • [2] Linguistic Lyapunov reinforcement learning control for robotic manipulators
    Kumar, Abhishek
    Sharma, Rajneesh
    NEUROCOMPUTING, 2018, 272 : 84 - 95
  • [3] Efficient Spatiotemporal Transformer for Robotic Reinforcement Learning
    Yang, Yiming
    Xing, Dengpeng
    Xu, Bo
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2022, 7 (03) : 7982 - 7989
  • [4] Adaptive and Robust Network Routing Based on Deep Reinforcement Learning with Lyapunov Optimization
    Zhuang, Zirui
    Wang, Jingyu
    Qi, Qi
    Liao, Jianxin
    Han, Zhu
    2020 IEEE/ACM 28TH INTERNATIONAL SYMPOSIUM ON QUALITY OF SERVICE (IWQOS), 2020,
  • [5] Sample-efficient Reinforcement Learning in Robotic Table Tennis
    Tebbe, Jonas
    Krauch, Lukas
    Gao, Yapeng
    Zell, Andreas
    2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 4171 - 4178
  • [6] Energy Efficient Edge Computing: When Lyapunov Meets Distributed Reinforcement Learning
    Sana, Mohamed
    Merluzzi, Mattia
    di Pietro, Nicola
    Strinati, Emilio Calvanese
    2021 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS WORKSHOPS (ICC WORKSHOPS), 2021,
  • [7] Reinforcement Learning for Robust Trajectory Design of Interplanetary Missions
    Zavoli, Alessandro
    Federici, Lorenzo
    JOURNAL OF GUIDANCE CONTROL AND DYNAMICS, 2021, 44 (08) : 1440 - 1453
  • [8] Estimating Lyapunov Region of Attraction for Robust Model-Based Reinforcement Learning USV
    Xia, Lei
    Cui, Yunduan
    Yi, Zhengkun
    Li, Huiyun
    Wu, Xinyu
    IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2025, 22 : 8898 - 8911
  • [9] Robust Speed Control of Ultrasonic Motors Based on Deep Reinforcement Learning of a Lyapunov Function
    Mustafa, Abdullah
    Sasamura, Tatsuki
    Morita, Takeshi
    IEEE ACCESS, 2022, 10 : 46895 - 46910
  • [10] Robust and efficient task scheduling for robotics applications with reinforcement learning
    Tejer, Mateusz
    Szczepanski, Rafal
    Tarczewski, Tomasz
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 127