End-to-End Reinforcement Learning for Torque Based Variable Height Hopping

Cited by: 5
Authors
Soni, Raghav [1]
Harnack, Daniel [2]
Isermann, Hauke [2]
Fushimi, Sotaro [3]
Kumar, Shivesh [2]
Kirchner, Frank [2]
Affiliations
[1] Banaras Hindu Univ, Dept Elect Engn, Indian Inst Technol, Varanasi, Uttar Pradesh, India
[2] DFKI GmbH Robot Innovat Ctr, Bremen, Germany
[3] Kyoto Univ, Undergrad Course Program Mech & Syst Engn, Kyoto, Japan
Source
2023 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) | 2023
DOI
10.1109/IROS55552.2023.10342187
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Legged locomotion is arguably the most versatile mode of locomotion and the best suited to natural or unstructured terrain. Intensive research into dynamic walking and running controllers has recently yielded great advances in both the optimal control and the reinforcement learning (RL) literature. Hopping is a challenging dynamic task that involves a flight phase and has the potential to increase the traversability of legged robots. Model-based control for hopping typically relies on accurate detection of the different jump phases, such as lift-off or touch-down, and on a separate controller for each phase. In this paper, we present an end-to-end RL-based torque controller that learns to implicitly detect the relevant jump phases, removing the need to provide manual heuristics for state detection. We also extend a method for simulation-to-reality transfer of the learned controller to contact-rich dynamic tasks, resulting in successful deployment on the robot after training without parameter tuning.
Pages: 7531-7538
Number of pages: 8