Reinforcement learning with prior policy guidance for motion planning of dual-arm free-floating space robot

Cited by: 27
Authors
Cao, Yuxue [1]
Wang, Shengjie [2]
Zheng, Xiang [3]
Ma, Wenke [4]
Xie, Xinru [1]
Liu, Lei [1]
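Affiliations
[1] Beijing Institute of Control Engineering, Beijing, People's Republic of China
[2] Tsinghua University, Department of Automation, Beijing, People's Republic of China
[3] City University of Hong Kong, Department of Computer Science, Hong Kong, People's Republic of China
[4] Qian Xuesen Laboratory of Space Technology, Beijing, People's Republic of China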
Keywords
Learning systems - Reinforcement learning - Robot programming;
DOI
10.1016/j.ast.2022.108098
Chinese Library Classification (CLC) number
V [Aeronautics, Astronautics];
Discipline classification code
08; 0825;
Abstract
Reinforcement learning methods as a promising technique have achieved superior results in the motion planning of free-floating space robots. However, due to the increase in planning dimension and the intensification of system dynamics coupling, the motion planning of dual-arm free-floating space robots remains an open challenge. In particular, the current study cannot handle the task of capturing a noncooperative object due to the lack of the pose constraint of the end-effectors. To address the problem, we propose a novel algorithm, EfficientLPT, to facilitate RL-based methods to improve planning accuracy efficiently. Our core contributions are constructing a mixed policy with prior knowledge guidance and introducing ‖·‖∞ to build a more reasonable reward function. Furthermore, our method successfully captures a rotating object with different spinning speeds. © 2023 Elsevier Masson SAS. All rights reserved.
Pages: 12