Cooperative Guidance Strategy for Active Spacecraft Protection from a Homing Interceptor via Deep Reinforcement Learning

Cited by: 1

Authors:

Ni, Weilin [1]

Liu, Jiaqi [2]

Li, Zhi [1]

Liu, Peng [2]

Liang, Haizhao [1]

Affiliations:

[1] Sun Yat-sen Univ, Sch Aeronaut & Astronaut, Shenzhen 518107, Peoples R China

[2] Natl Key Lab Sci & Technol Test Phys & Numer Math, Beijing 100076, Peoples R China

Source:

MATHEMATICS | 2023 / Vol. 11 / Issue 19

Funding:

National Natural Science Foundation of China;

Keywords:

cooperative guidance; reinforcement learning; active protection; guidance law; MISSILE; EVASION; TARGET; ALGORITHMS; PURSUIT; DEFENSE; LAWS;

DOI:

10.3390/math11194211

Chinese Library Classification (CLC) number:

O1 [Mathematics];

Discipline classification code:

0701 ; 070101 ;

Abstract:

This paper investigates the cooperative guidance problem for a spacecraft equipped with an active defense vehicle. The engagement involves a target spacecraft, an active defense vehicle, and an interceptor, in which the target spacecraft and its defender attempt to evade the interceptor. Prior knowledge uncertainty and observation noise, both of which critically affect traditional guidance strategies such as the differential-game-based guidance method, are taken into account simultaneously. In this setting, we propose an intelligent cooperative active defense (ICAAI) guidance strategy based on deep reinforcement learning. ICAAI coordinates defender and target maneuvers to achieve successful evasion with less prior knowledge and in the presence of observation noise. Furthermore, we introduce an efficient and stable convergence (ESC) training approach that employs reward shaping and curriculum learning to tackle the sparse reward problem in ICAAI training. Numerical experiments demonstrate ICAAI's real-time performance, convergence, adaptiveness, and robustness through the learning process and Monte Carlo simulations. The learning process shows improved convergence efficiency with ESC, while the simulation results show that ICAAI is more robust and adaptive than optimal guidance laws.


Pages: 25
