Terrain-Aware Risk-Assessment-Network-Aided Deep Reinforcement Learning for Quadrupedal Locomotion in Tough Terrain

Cited by: 3
Authors
Zhang, Hongyin [1, 2]
Wang, Jilong [1, 2, 3]
Wu, Zhengqing [1, 2]
Wang, Yinuo [1, 2]
Wang, Donglin [1, 2]
Affiliations
[1] Westlake Univ, Sch Engn, Machine Intelligence Lab MiLAB, Hangzhou 310024, Peoples R China
[2] Westlake Inst Adv Study, Inst Adv Technol, Hangzhou 310024, Peoples R China
[3] Univ Calif Santa Cruz, Santa Cruz, CA 95064 USA
Source
2021 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) | 2021
Keywords
DOI
10.1109/IROS51168.2021.9636519
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
When it comes to the control system of quadruped robots, deep reinforcement learning (DRL) is considered to be a promising solution. Despite years of development in this field, difficulties remain in guaranteeing the action stability of DRL-based quadruped robots' locomotion, especially in tough terrain. In this paper, a terrain-aware teacher-student controller integrating a risk assessment network (RAN) is proposed to alleviate this problem. During the training phase, the RAN can evaluate the risk level of historical observations or the current state and further guide the update of the policy, thereby assisting the policy in selecting better actions and avoiding risky ones. Furthermore, the real-time elevation map is transmitted to the controller as visual information, so that it can perceive the terrain to produce higher-performance locomotion. With the aforementioned configuration, we enable a robot to traverse various challenging terrains in simulation and bound or trot stably in the real environment.
Pages: 4538-4545
Page count: 8