Maneuvering decision-making of multi-UAV attack-defence confrontation based on PER-MATD3

Authors
Fu X. [1]
Xu Z. [2]
Zhu J. [1, 3]
Wang N. [1]
Affiliations
[1] School of Electronics and Information, Northwestern Polytechnical University, Xi’an
[2] Xi’an Institute of Applied Optics, Xi’an
[3] AVIC Shenyang Aircraft Design Research Institute, Shenyang
Keywords
attack-defence confrontation; maneuvering decision-making; multi-agent reinforcement learning; multi-UAVs; PER-MATD3
DOI
10.7527/S1000-6893.2022.27083
Abstract
This paper studies maneuvering decision-making for multi-UAV attack-defence confrontation in a complex environment with randomly distributed obstacles. A motion model and a radar detection model are constructed for both the attacking and defending sides. The Twin Delayed Deep Deterministic policy gradient (TD3) algorithm is extended to the multi-agent setting to address the value-function overestimation of the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm. To improve learning efficiency, a Prioritized Experience Replay Multi-Agent Twin Delayed Deep Deterministic policy gradient (PER-MATD3) algorithm is proposed based on the prioritized experience replay mechanism. Simulation experiments show that the proposed method performs well in multi-UAV attack-defence confrontation maneuvering decision-making, and comparison with other algorithms verifies the advantages of PER-MATD3 in convergence speed and stability. © 2023 AAAS Press of Chinese Society of Aeronautics and Astronautics. All rights reserved.
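The two ingredients named in the abstract can be illustrated with a minimal sketch. It assumes the standard formulations: the clipped double-Q target from TD3 and proportional prioritized experience replay. The abstract does not give the paper's exact equations, network structure, or hyperparameters, so the function names, the values of `gamma`, `alpha`, `beta`, and `eps`, and the toy arrays below are all illustrative assumptions, and the next-state critic values are plain numbers standing in for the outputs of two centralized target critics.

```python
import numpy as np


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: taking the smaller of two target-critic
    estimates is what counters value overestimation."""
    return reward + gamma * (1.0 - done) * np.minimum(q1_next, q2_next)


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritization: p_i = (|delta_i| + eps)^alpha, normalized."""
    priorities = (np.abs(td_errors) + eps) ** alpha
    return priorities / priorities.sum()


def importance_weights(probs, beta=0.4):
    """Importance-sampling weights (N * P(i))^(-beta), scaled so the largest is 1."""
    weights = (len(probs) * probs) ** (-beta)
    return weights / weights.max()


# Toy replay buffer of four transitions, described only by their TD errors (assumed values).
rng = np.random.default_rng(0)
td_errors = np.array([0.1, 2.0, 0.5, 0.0])
probs = per_probabilities(td_errors)

# Sample a mini-batch: transitions with larger TD error are drawn more often.
batch = rng.choice(len(probs), size=2, p=probs)
weights = importance_weights(probs)[batch]

# Stand-ins for the two target critics' next-state values (assumed numbers).
target = td3_target(
    reward=np.array([1.0, 0.0]),
    done=np.array([0.0, 1.0]),
    q1_next=np.array([2.0, 5.0]),
    q2_next=np.array([3.0, 4.0]),
)
```

In a full training loop, the sampled `weights` would scale each transition's critic loss, and the new TD errors would be written back as updated priorities. Those steps are omitted here because the abstract does not describe them.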