Robust trajectory design and guidance for far-range rendezvous using reinforcement learning with safety and observability considerations

Authors
Wijayatunga, Minduli Charithma [1]
Armellin, Roberto [1]
Holt, Harry [2]
Affiliations
[1] Univ Auckland, Fac Engn, 20 Symonds St, Auckland 1010, New Zealand
[2] ESA, European Space Res & Technol Ctr, Adv Concepts Team, NL-2201 AZ Noordwijk, Netherlands
Keywords
Angles-only navigation; Observability; Reinforcement learning; Proximal policy optimization; Far-range approach; Particle swarm optimization; Proximity operations; Relative navigation; Spacecraft
DOI
10.1016/j.ast.2025.109996
Chinese Library Classification
V [Aviation, Aerospace]
Discipline classification code
08; 0825
Abstract
Observability, safety, and robustness are critical for successful rendezvous and proximity operation (RPO) missions. The use of angles-only navigation (AON) for these missions is often seen as limited due to its inability to determine range, though it remains appealing for its low cost. This work utilizes the proximal policy optimization algorithm in reinforcement learning for the guidance of the far-range phase of an RPO mission, ensuring observability, safety, and minimal fuel consumption under AON. Trajectory planning for the mission is done via a nonlinear optimizer, which also considers safety and observability. The constraint satisfaction challenges during trajectory planning and guidance are alleviated through the problem formulation, which incorporates Lambert's method to guarantee that the target state is always reached. During the training of the reinforcement learning controller, a predefined set of grid points in the initial state distribution is used to evaluate the policies and select the best policy fairly. The nominal and reinforcement learning-guided trajectories are validated for observability and safety, and the guidance controller's performance is tested through Monte Carlo simulations. Results show that for a 6 h mission previously presented in the literature, in the presence of errors, the reinforcement learning controller consumes 22.31% less Δv compared to the next-best guidance strategy explored while fully adhering to safety and observability constraints.
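The abstract's idea of scoring every candidate policy on the same predefined grid of initial states can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names (make_grid, select_best_policy, toy_cost), the two-dimensional state, the scalar-gain "policies", and the quadratic cost are all hypothetical stand-ins chosen only to show why a fixed grid makes the comparison fair.

```python
import itertools

import numpy as np


def make_grid(lower, upper, points_per_dim):
    """Return all combinations of evenly spaced values per state dimension."""
    axes = [np.linspace(lo, hi, points_per_dim) for lo, hi in zip(lower, upper)]
    return np.array(list(itertools.product(*axes)))


def select_best_policy(policies, grid, rollout_cost):
    """Score each policy on the SAME grid; return (best index, mean costs)."""
    mean_costs = [
        float(np.mean([rollout_cost(policy, x0) for x0 in grid]))
        for policy in policies
    ]
    return int(np.argmin(mean_costs)), mean_costs


def toy_cost(gain, x0):
    """Hypothetical stand-in for a rollout: residual error after one correction."""
    return float(np.sum(((1.0 - gain) * x0) ** 2))


# Assumed bounds of the initial-state distribution (illustrative values only).
grid = make_grid(lower=[-1.0, -1.0], upper=[1.0, 1.0], points_per_dim=3)

# Three hypothetical policy checkpoints, each represented by a scalar gain.
best, costs = select_best_policy([0.2, 0.9, 0.5], grid, toy_cost)
print(best, costs)
```

Because every checkpoint is evaluated on identical initial states, differences in mean cost reflect the policies themselves rather than the luck of a random draw, which is the fairness argument made in the abstract.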
Pages: 16