High-speed quadrupedal locomotion by imitation-relaxation reinforcement learning

Cited by: 40
Authors
Jin, Yongbin [1, 2, 3, 4]
Liu, Xianwei [1]
Shao, Yecheng [1, 4]
Wang, Hongtao [1, 2, 3, 4]
Yang, Wei [1, 2, 3, 4]
Affiliations
[1] Zhejiang Univ, Ctr X Mech, Hangzhou, Peoples R China
[2] Hangzhou Global Sci & Technol Innovat Ctr, ZJU, Hangzhou, Peoples R China
[3] Zhejiang Univ, State Key Lab Fluid Power & Mechatron Syst, Hangzhou, Peoples R China
[4] Zhejiang Univ, Inst Appl Mech, Hangzhou, Peoples R China
Keywords
ENTROPY STABILITY; DYNAMICS; DESIGN; ROBOT; MODEL
DOI
10.1038/s42256-022-00576-3
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Fast and stable locomotion of legged robots involves demanding and contradictory requirements, in particular rapid control frequency as well as an accurate dynamics model. Benefiting from universal approximation ability and offline optimization of neural networks, reinforcement learning has been used to solve various challenging problems in legged robot locomotion; however, the optimal control of quadruped robots requires optimizing multiple objectives such as keeping balance, improving efficiency, realizing periodic gait and following commands. These objectives cannot always be achieved simultaneously, especially at high speed. Here, we introduce an imitation-relaxation reinforcement learning (IRRL) method to optimize the objectives in stages. To bridge the gap between simulation and reality, we further introduce the concept of stochastic stability into system robustness analysis. The state space entropy decreasing rate is a quantitative metric and can sharply capture the occurrence of period-doubling bifurcation and possible chaos. By employing IRRL in training and the stochastic stability analysis, we are able to demonstrate a stable running speed of 5.0 m s⁻¹ for an MIT-MiniCheetah-like robot.
Pages: 1198-1208
Page count: 11