Memory-extraction-based DRL cooperative guidance against the maneuvering target protected by interceptors

Times Cited: 0
Authors
Sun, Hao [1 ]
Yan, Shi [1 ]
Liang, Yan [1 ]
Ma, Chaoxiong [1 ]
Zhang, Tao [2 ]
Pei, Liuyu [1 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Automat, Xian 710072, Shaanxi, Peoples R China
[2] Air Force Engn Univ, Sch Air Traff Control & Nav, Xian 710051, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Missiles; Cooperative guidance; Spatio-temporal memory extraction; Multi-order Markov decision process; Deep reinforcement learning; Maneuvering target;
DOI
10.1016/j.ast.2024.109575
Chinese Library Classification (CLC)
V [Aeronautics, Astronautics];
Discipline Classification Code
08 ; 0825 ;
Abstract
This paper addresses an open problem for missiles: achieving cooperative guidance under collaborative-parameter constraints despite interference from pursuing interceptors (INTs) and a maneuvering target, a difficulty rooted in the complex, time-varying relationships induced by the target-missile-interceptor (TMI) engagement. A Memory-Extraction-based Soft Actor-Critic (ME-SAC) approach is proposed that enhances the collaborative performance of the missiles by implicitly extracting the coupled motion characteristics of the TMI participants from historical states, thereby jointly optimizing situation awareness and strategy. First, the cooperative guidance task is formulated as a multi-order Markov decision process (MOMDP) to better represent the dynamic evolution of the engagement, and a memory-extraction process is introduced to alleviate the curse of dimensionality. Second, a memory-decision-oriented maximum entropy framework combined with memory update modules is designed to strengthen the strategy search. Then, domain-knowledge-based pre-training is applied to accelerate convergence. Finally, in simulation evaluations across various scenarios, the proposed ME-SAC proves more promising than typical DRL-based and model-based algorithms in task success rate, learning efficiency, and adaptability.
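The record does not specify the ME-SAC architecture, so the following is an illustrative sketch only of the general idea the abstract describes: in a multi-order MDP the policy conditions on the last k states, pi(a_t | s_{t-k+1}, ..., s_t), and a memory-extraction module compresses that window into a fixed-size vector driving a squashed-Gaussian SAC policy. The class name, dimensions, and the choice of a GRU are assumptions here, not the authors' implementation.

# Illustrative sketch (not the authors' code): a memory-extraction actor
# for a SAC-style agent. A GRU compresses a window of recent
# target-missile-interceptor (TMI) states into a memory vector that
# conditions a squashed-Gaussian policy head, as in standard SAC.
import torch
import torch.nn as nn
from torch.distributions import Normal

class MemoryExtractionActor(nn.Module):
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 128):
        super().__init__()
        # Assumed memory-extraction module: a GRU over the state history.
        self.memory = nn.GRU(state_dim, hidden, batch_first=True)
        self.mu = nn.Linear(hidden, action_dim)
        self.log_std = nn.Linear(hidden, action_dim)

    def forward(self, history: torch.Tensor):
        # history: (batch, window, state_dim) -- the k most recent TMI
        # states, i.e., the input of a k-th-order decision process.
        _, h = self.memory(history)        # h: (1, batch, hidden)
        h = h.squeeze(0)                   # fixed-size memory vector
        mu = self.mu(h)
        log_std = self.log_std(h).clamp(-20.0, 2.0)
        dist = Normal(mu, log_std.exp())
        raw = dist.rsample()               # reparameterized sample
        action = torch.tanh(raw)           # bounded command, e.g., normalized acceleration
        # tanh change-of-variables correction for the action log-likelihood
        log_prob = (dist.log_prob(raw) - torch.log(1.0 - action.pow(2) + 1e-6)).sum(-1)
        return action, log_prob

# Usage: a batch of 4 histories, each a window of 10 six-dimensional TMI states.
actor = MemoryExtractionActor(state_dim=6, action_dim=2)
act, logp = actor(torch.randn(4, 10, 6))
print(act.shape, logp.shape)               # torch.Size([4, 2]) torch.Size([4])

Feeding the policy a state window rather than a single state is what distinguishes this from a first-order SAC actor; the recurrent encoder plays the role the abstract assigns to the memory-extraction process, trading the full history for a learned fixed-size summary to avoid the curse of dimensionality.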
Pages: 18