Impulsive maneuver strategy for multi-agent orbital pursuit-evasion game under sparse rewards

Cited by: 1
Authors
Wang, Hongbo [1]
Zhang, Yao [1]
Affiliations
[1] Beijing Inst Technol, Sch Aerosp Engn, Beijing 100081, Peoples R China
Keywords
Orbital pursuit-evasion game; Impulsive thrust; Reinforcement learning; Hierarchical network; Hindsight experience
DOI
10.1016/j.ast.2024.109618
Chinese Library Classification (CLC)
V [Aviation, Aerospace]
Discipline classification code
08; 0825
Abstract
To address the subjectivity of dense reward design in orbital pursuit-evasion games with multiple optimization objectives, this paper proposes a reinforcement learning method with a hierarchical network structure that guides game strategies under sparse rewards. First, to overcome the convergence difficulties of reinforcement learning training under sparse rewards, a hierarchical network structure based on hindsight experience replay is proposed. Second, because orbital dynamics impose strict constraints on the spacecraft state space, the reachable domain method is introduced to refine the subgoal space of the hierarchical network, making the subgoals easier to achieve. Finally, by adopting a centralized training-layered execution approach, a complete multi-agent reinforcement learning method with the hierarchical network structure is established, enabling the networks at each level to learn effectively in parallel in sparse-reward environments. Numerical simulations indicate that, under the single-agent reinforcement learning framework, the proposed method exhibits superior stability in the late training stage and improves exploration efficiency in the early stage by 38.89% to 55.56% relative to the baseline method. Under the multi-agent reinforcement learning framework, as the relative distance decreases, the subgoals generated by the hierarchical network shift from long-term to short-term, consistent with human behavioral logic.
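The abstract names hindsight experience replay as the basis for learning under sparse rewards. As a rough illustration only, the sketch below shows the generic "future" relabeling idea: a transition that missed its original goal is stored again with a state actually reached later in the same episode as the goal, so some stored transitions carry a non-failure reward. This is not the paper's hierarchical method. The names `sparse_reward` and `her_relabel`, the tuple layout, the 0/-1 reward convention, and the tolerance are all assumptions made for this example.

```python
import random
from typing import List, Optional, Tuple

State = Tuple[float, ...]
# Hypothetical transition layout: (state, action, next_state, goal, reward)
Transition = Tuple[State, int, State, State, float]


def sparse_reward(achieved: State, goal: State, tol: float = 1e-6) -> float:
    """Assumed sparse reward: 0 if the goal is met within tol, otherwise -1."""
    dist = max(abs(a - g) for a, g in zip(achieved, goal))
    return 0.0 if dist <= tol else -1.0


def her_relabel(
    episode: List[Transition],
    k: int = 4,
    rng: Optional[random.Random] = None,
) -> List[Transition]:
    """Return the episode plus k hindsight copies of every transition.

    Each copy replaces the original goal with a state reached at the same
    or a later step of the episode ("future" strategy) and recomputes the
    sparse reward against that substituted goal.
    """
    rng = rng or random.Random(0)
    out: List[Transition] = []
    for t, (s, a, s_next, goal, r) in enumerate(episode):
        out.append((s, a, s_next, goal, r))  # keep the original transition
        for _ in range(k):
            future = rng.randint(t, len(episode) - 1)  # inclusive bounds
            new_goal = episode[future][2]  # a state that was actually reached
            out.append((s, a, s_next, new_goal, sparse_reward(s_next, new_goal)))
    return out
```

In this sketch, the last transition of an episode is always relabeled with its own reached state as the goal, so every relabeled episode contains at least one zero-reward sample even when the original goal was never met.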
Pages: 16