Autonomous Maneuver Decision of UCAV Air Combat Based on Double Deep Q Network Algorithm and Stochastic Game Theory

Cited by: 14
Authors
Cao, Yuan [1]
Kou, Ying-Xin [1]
Li, Zhan-Wu [1]
Xu, An [1]
Affiliations
[1] AF Engn Univ, Aviat Engn Coll, Xian 710038, Peoples R China
Keywords
UAV
DOI
10.1155/2023/3657814
Chinese Library Classification number
V [Aviation, Aerospace]
Discipline classification code
08; 0825
Abstract
In modern air combat, which is easily affected by complex factors, it is difficult for an unmanned combat aerial vehicle (UCAV) to perceive situation information quickly and accurately and to make maneuver decisions autonomously. To address this problem, this paper proposes a UCAV maneuver decision algorithm that combines deep reinforcement learning with game theory. First, based on the UCAV dynamics model and a maneuver library, an air combat situation assessment model and an advantage reward function are established, and sample data for the situation assessment indicators are constructed using the structure entropy weight method. Second, a convolutional neural network (CNN) is used to process the high-dimensional continuous situation features of the UCAV in air combat, eliminate correlation and redundancy among the situation features, and approximate the action-value function. Then, the double deep Q network (DDQN) algorithm from reinforcement learning (RL) is introduced to train the agent through interaction with the environment; it is combined with the Minimax algorithm from stochastic game theory to solve for the optimal value function in each specific state, yielding the optimal maneuver decision for the UCAV. Air combat simulation results show that, with this method, the UCAV can choose maneuvers autonomously in different situations and quickly occupy a dominant position, which greatly improves its combat effectiveness.
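The combination of DDQN and Minimax mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes tabular action values in place of the CNN approximator, a pure-strategy maximin rule in place of a full mixed-strategy minimax solution, and hypothetical names (`maximin_action`, `minimax_ddqn_target`, `q_online`, `q_target`, `gamma`) that do not come from the source.

```python
# Hypothetical sketch: a double-Q learning target with a pure-strategy maximin value.
# Assumption: action values are stored as q[state][own_action][opponent_action].


def maximin_action(q_state):
    """Return the index of the own action whose worst-case value is largest."""
    worst_case = [min(row) for row in q_state]
    return max(range(len(worst_case)), key=worst_case.__getitem__)


def minimax_ddqn_target(reward, next_state, q_online, q_target,
                        gamma=0.9, done=False):
    """Compute a one-step learning target for a two-player zero-sum game.

    The own action is selected with the online estimates (the double-Q idea),
    and its worst-case value is then read from the target estimates.
    """
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    best_own = maximin_action(q_online[next_state])
    worst_case_value = min(q_target[next_state][best_own])
    return reward + gamma * worst_case_value


# Usage with two states, two own actions, and two opponent actions.
q_online = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[1.0, 3.0], [2.0, 2.5]],
]
q_target = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[5.0, 5.0], [4.0, 0.5]],
]
target = minimax_ddqn_target(1.0, 1, q_online, q_target, gamma=0.9)
```

Separating action selection (online estimates) from action evaluation (target estimates) is what distinguishes double Q-learning from the single-estimator update; the `min` over opponent actions stands in for the game-theoretic value of the state.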
Pages: 20