Reinforcement learning with prior policy guidance for motion planning of dual-arm free-floating space robot

Cited by: 17
Authors
Cao, Yuxue [1]
Wang, Shengjie [2]
Zheng, Xiang [3]
Ma, Wenke [4]
Xie, Xinru [1]
Liu, Lei [1]
Affiliations
[1] Beijing Inst Control Engn, Beijing, Peoples R China
[2] Tsinghua Univ, Dept Automat, Beijing, Peoples R China
[3] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
[4] Qian Xuesen Lab Space Technol, Beijing, Peoples R China
Keywords
Compendex;
DOI
10.1016/j.ast.2022.108098
Chinese Library Classification number
V [Aeronautics, Astronautics];
Discipline classification code
08; 0825;
Abstract
Reinforcement learning (RL) methods, as a promising technique, have achieved superior results in the motion planning of free-floating space robots. However, due to the increase in planning dimension and the intensification of system dynamics coupling, the motion planning of dual-arm free-floating space robots remains an open challenge. In particular, the current study cannot handle the task of capturing a noncooperative object due to the lack of the pose constraint of the end-effectors. To address the problem, we propose a novel algorithm, EfficientLPT, to facilitate RL-based methods to improve planning accuracy efficiently. Our core contributions are constructing a mixed policy with prior knowledge guidance and introducing the infinity norm ‖·‖∞ to build a more reasonable reward function. Furthermore, our method successfully captures a rotating object with different spinning speeds. © 2023 Elsevier Masson SAS. All rights reserved.
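The abstract does not say how ‖·‖∞ enters the reward, so the following is only an illustrative sketch and not the authors' formulation. It assumes a hypothetical six-component end-effector pose error (three position terms, three orientation terms) and compares a reward built from the Euclidean norm with one built from the infinity norm, which penalizes only the largest remaining error component. The function names and sample values are assumptions made for illustration.

```python
import math


def l2_reward(error):
    """Negative Euclidean norm of the error vector (assumed baseline)."""
    return -math.sqrt(sum(e * e for e in error))


def linf_reward(error):
    """Negative infinity norm: only the worst error component counts."""
    return -max(abs(e) for e in error)


if __name__ == "__main__":
    # Hypothetical pose error: five small components and one large one.
    pose_error = [0.01, 0.02, 0.30, 0.01, 0.02, 0.01]
    print("L2 reward:  ", l2_reward(pose_error))
    print("Linf reward:", linf_reward(pose_error))
```

Under this assumption, the infinity-norm reward improves only when the worst component shrinks, which is one plausible way a reward could push every pose component below a tolerance.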
Pages: 12