Impulsive maneuver strategy for multi-agent orbital pursuit-evasion game under sparse rewards

Cited by: 1
Authors
Wang, Hongbo [1]
Zhang, Yao [1]
Affiliations
[1] Beijing Inst Technol, Sch Aerosp Engn, Beijing 100081, Peoples R China
Keywords
Orbital pursuit-evasion game; Impulsive thrust; Reinforcement learning; Hierarchical network; Hindsight experience;
DOI
10.1016/j.ast.2024.109618
Chinese Library Classification (CLC) number
V [Aviation, Aerospace];
Discipline classification code
08; 0825
Abstract
Dense reward designs for the orbital pursuit-evasion game with multiple optimization objectives are inherently subjective. To address this, this paper proposes a reinforcement learning method with a hierarchical network structure that guides game strategies under sparse rewards. First, to overcome the convergence difficulties of reinforcement learning training under sparse rewards, a hierarchical network structure is built on hindsight experience replay. Second, because orbital dynamics impose strict constraints on the spacecraft state space, the reachable domain method is introduced to refine the subgoal space of the hierarchical network, which makes the subgoals easier to achieve. Finally, a centralized training-layered execution approach yields a complete multi-agent reinforcement learning method with the hierarchical network structure, in which the networks at each level learn effectively in parallel in sparse-reward environments. Numerical simulations indicate that, under the single-agent reinforcement learning framework, the proposed method exhibits superior stability in the late training stage and improves exploration efficiency in the early stage by 38.89% to 55.56% relative to the baseline method. Under the multi-agent reinforcement learning framework, as the relative distance decreases, the subgoals generated by the hierarchical network shift from long-term to short-term, which is consistent with human behavioral logic.
Pages: 16
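The abstract builds its hierarchical network on hindsight experience replay to cope with sparse rewards. As a rough illustration of that underlying idea only, the sketch below relabels the goals of a failed episode with states the agent actually reached later on (the generic "future" relabeling strategy), so that some stored transitions carry a non-penalty reward. It is not the paper's hierarchical method. The names `Transition`, `sparse_reward`, and `relabel_with_hindsight`, the tolerance `tol`, the 0 / -1 reward values, and the use of plain tuples as states are all assumptions made for this example.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple

State = Tuple[float, ...]  # hypothetical state representation


@dataclass(frozen=True)
class Transition:
    state: State
    action: State
    next_state: State
    goal: State
    reward: float


def sparse_reward(achieved: State, goal: State, tol: float = 1.0) -> float:
    """Return 0.0 if the achieved state is within tol of the goal, else -1.0."""
    dist = sum((a - g) ** 2 for a, g in zip(achieved, goal)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def relabel_with_hindsight(
    episode: List[Transition],
    k: int = 4,
    tol: float = 1.0,
    rng: Optional[random.Random] = None,
) -> List[Transition]:
    """Create k relabeled copies of each transition.

    Each copy replaces the original goal with a state reached at the same
    step or later in the same episode, then recomputes the sparse reward.
    """
    rng = rng or random.Random(0)
    relabeled = []
    for t, tr in enumerate(episode):
        future = episode[t:]  # transitions from step t onward
        for _ in range(k):
            new_goal = rng.choice(future).next_state
            relabeled.append(
                Transition(
                    tr.state,
                    tr.action,
                    tr.next_state,
                    new_goal,
                    sparse_reward(tr.next_state, new_goal, tol),
                )
            )
    return relabeled
```

In a replay-buffer setting, the relabeled transitions would be stored alongside the original ones, so the learner sees successful outcomes even when the true goal was never reached.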