Impulsive maneuver strategy for multi-agent orbital pursuit-evasion game under sparse rewards

Cited by: 1

Authors:

Wang, Hongbo [1]

Zhang, Yao [1]

Affiliations:

[1] Beijing Inst Technol, Sch Aerosp Engn, Beijing 100081, Peoples R China

Source:

AEROSPACE SCIENCE AND TECHNOLOGY | 2024 / Vol. 155

Keywords:

Orbital pursuit-evasion game; Impulsive thrust; Reinforcement learning; Hierarchical network; Hindsight experience;

DOI:

10.1016/j.ast.2024.109618

Chinese Library Classification:

V [Aeronautics, Astronautics];

Discipline classification code:

08 ; 0825 ;

Abstract:

To address the subjectivity of dense reward designs for the orbital pursuit-evasion game with multiple optimization objectives, this paper proposes a reinforcement learning method with a hierarchical network structure to guide game strategies under sparse rewards. First, to overcome the convergence challenges of reinforcement learning training under sparse rewards, a hierarchical network structure is proposed based on hindsight experience replay. Second, considering the strict constraints that orbital dynamics impose on the spacecraft state space, the reachable domain method is introduced to refine the subgoal space in the hierarchical network, making the subgoals easier to achieve. Finally, by adopting the centralized training-layered execution approach, a complete multi-agent reinforcement learning method with a hierarchical network structure is established, enabling the networks at each level to learn effectively in parallel within sparse reward environments. Numerical simulations indicate that, under the single-agent reinforcement learning framework, the proposed method exhibits superior stability in the late training stage and improves exploration efficiency in the early stage by 38.89% to 55.56% relative to the baseline method. Under the multi-agent reinforcement learning framework, as the relative distance decreases, the subgoals generated by the hierarchical network transition from long-term to short-term, aligning with human behavioral logic.
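The abstract relies on hindsight experience replay to make sparse rewards learnable. As a rough illustration only, the sketch below shows the generic "future" goal-relabeling idea from the hindsight experience replay literature, not the paper's hierarchical network or its reachable-domain subgoal refinement. All names (`sparse_reward`, `her_relabel`, the tuple layout, the 0/-1 reward convention, the 1-D toy states) are assumptions made for this example.

```python
import random
from typing import List, Optional, Tuple

State = Tuple[float, ...]
# Assumed transition layout: (state, action, next_state, goal, reward)
Transition = Tuple[State, int, State, State, float]


def sparse_reward(achieved: State, goal: State, tol: float = 1e-6) -> float:
    """Assumed sparse reward: 0.0 if the goal is reached within tol, else -1.0."""
    dist = sum((a - g) ** 2 for a, g in zip(achieved, goal)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def her_relabel(
    episode: List[Transition], k: int = 4, rng: Optional[random.Random] = None
) -> List[Transition]:
    """Return the original transitions plus k relabeled copies of each.

    Each copy replaces the goal with a state actually reached at the same
    or a later step of the episode, then recomputes the sparse reward.
    """
    rng = rng or random.Random(0)
    out: List[Transition] = []
    for t, (s, a, s_next, goal, r) in enumerate(episode):
        out.append((s, a, s_next, goal, r))  # keep the original transition
        for _ in range(k):
            future = rng.randint(t, len(episode) - 1)  # inclusive bounds
            new_goal = episode[future][2]  # state achieved at that step
            out.append((s, a, s_next, new_goal, sparse_reward(s_next, new_goal)))
    return out


# Toy 1-D episode that never reaches its original goal at 10.0.
goal = (10.0,)
episode = [
    ((float(i),), 0, (float(i + 1),), goal, sparse_reward((float(i + 1),), goal))
    for i in range(3)
]
relabeled = her_relabel(episode, k=2)
```

In this toy episode every original reward is -1.0, yet the relabeled buffer contains successful transitions, which is the mechanism that gives a learner a usable signal when the true goal is rarely reached.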

Pages: 16
