A learning-based algorithm for turn-based orbital pursuit-evasion problem with reaction-time delay

Cited by: 0
Authors
Zhao, Liran [1]
Sun, Qinbo [1]
Xu, Sihan [1]
Dang, Zhaohui [1]
Affiliations
[1] Northwestern Polytech Univ, Sch Astronaut, Xian 710072, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
Reaction-time delay; Turn-based; Orbital pursuit-evasion game; Multi-agent reinforcement learning;
DOI
10.1016/j.engappai.2025.110231
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we propose an artificial intelligence-based methodology for the impulsive orbital pursuit-evasion problem that accounts for the often-neglected factor of reaction-time delay. We define this scenario as a Delayed Turn-based Orbital Pursuit-Evasion Game (DT-OPEG) and establish the relevant concepts and definitions based on orbital dynamics and game theory. We then formulate the DT-OPEG problem under several constraints inherent in real-world missions, including nonlinear orbital dynamics, maneuvering capabilities, fuel reserves, and mission duration. To address the complexity of this problem, especially the challenge of incorporating action delays, we propose a Delayed Turn-based Multi-Agent Deep Deterministic Policy Gradient (DT-MADDPG) algorithm. Developing this algorithm involves establishing a Turn-based Markov Decision Process (T-MDP) model with reaction-time delay, constructing a turn-based training framework, developing a network architecture based on MADDPG, and designing reward functions. Finally, simulation analyses are conducted for both two-dimensional and three-dimensional DT-OPEG scenarios, confirming the effectiveness of the proposed algorithm and demonstrating the winning mechanism in this type of game.
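The idea of a turn-based game with reaction-time delay can be illustrated with a small sketch. This is not the authors' DT-MADDPG algorithm or their T-MDP model: it is a toy example on a straight line with no orbital dynamics, and the function name `simulate`, the `policies` dictionary, and the `delay_steps` parameter are all hypothetical names introduced here. It only shows one plausible way to model the delay, namely queuing each player's impulse so that it takes effect a fixed number of that player's own turns after it was decided.

```python
from collections import deque


def simulate(policies, delay_steps, n_turns, dt=1.0):
    """Toy turn-based game with delayed impulses (illustrative assumption only)."""
    # State per player is [position, velocity] on a line, not an orbit.
    state = {"pursuer": [0.0, 0.0], "evader": [10.0, 0.0]}
    # Pre-fill each queue with zero impulses so that a decision made now
    # is applied delay_steps of that player's turns later.
    pending = {name: deque([0.0] * delay_steps) for name in state}
    order = ["pursuer", "evader"]
    history = []
    for turn in range(n_turns):
        mover = order[turn % 2]
        # The mover observes the current state and commits an impulse.
        pending[mover].append(policies[mover](state))
        # The impulse actually applied is the oldest one still queued.
        applied = pending[mover].popleft()
        state[mover][1] += applied
        # Both players coast for one time step.
        for s in state.values():
            s[0] += s[1] * dt
        history.append(
            (turn, mover, applied, state["pursuer"][0], state["evader"][0])
        )
    return history


if __name__ == "__main__":
    # Hypothetical constant policies, used only to exercise the loop.
    demo_policies = {"pursuer": lambda s: 1.0, "evader": lambda s: 0.0}
    for row in simulate(demo_policies, delay_steps=1, n_turns=4):
        print(row)
```

With `delay_steps=0` each impulse is applied on the turn it is chosen; with a positive delay the player's first turns apply zero impulse while the queue drains, which is the effect a learning agent would have to anticipate.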
Pages: 18