Autonomous Maneuver Decision of UCAV Air Combat Based on Double Deep Q Network Algorithm and Stochastic Game Theory

Cited by: 16
Authors
Cao, Yuan [1]
Kou, Ying-Xin [1]
Li, Zhan-Wu [1]
Xu, An [1]
Affiliations
[1] AF Engn Univ, Aviat Engn Coll, Xian 710038, Peoples R China
Keywords
UAV;
DOI
10.1155/2023/3657814
Chinese Library Classification number
V [Aviation, Aerospace];
Discipline classification code
08 ; 0825 ;
Abstract
In modern air combat, an unmanned combat aerial vehicle (UCAV) has difficulty perceiving situation information quickly and accurately and making maneuvering decisions autonomously, and its decisions are easily affected by complex factors. To address this problem, this paper proposes a UCAV maneuvering decision algorithm that combines deep reinforcement learning with game theory. First, based on the UCAV dynamics model and a maneuver library, an air combat situation assessment model and an advantage reward function are established, and sample data for the situation assessment indicators are constructed using the structure entropy weight method. Second, a convolutional neural network (CNN) is used to process the high-dimensional continuous situation features of the UCAV in air combat, eliminating correlation and redundancy between situation features, and the network is trained to approximate the action-value function. Then, the double deep Q network (DDQN) algorithm from reinforcement learning (RL) is introduced to train the agent through interaction with the environment. It is combined with the Minimax algorithm from stochastic game theory to solve for the optimal value function in each specific state, which yields the optimal maneuver decision of the UCAV. Air combat simulation results show that, with this method, the UCAV can choose maneuvers autonomously in different situations and quickly occupy a dominant position, which improves its combat effectiveness.
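The combination of DDQN and Minimax described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that Q-values for the next state are given as a matrix indexed by (own maneuver, opponent maneuver), and it uses a pure-strategy maximin as a stand-in for the Minimax step, whereas a full stochastic-game solution may use mixed strategies. The function names maximin_action, maximin_value, and ddqn_minimax_target are hypothetical.

```python
from typing import List

Matrix = List[List[float]]  # rows: own maneuvers, columns: opponent maneuvers


def maximin_action(q_matrix: Matrix) -> int:
    """Index of the own maneuver whose worst-case Q-value is largest."""
    return max(range(len(q_matrix)), key=lambda a: min(q_matrix[a]))


def maximin_value(q_matrix: Matrix) -> float:
    """Pure-strategy maximin value: best own maneuver against the worst reply."""
    return max(min(row) for row in q_matrix)


def ddqn_minimax_target(reward: float, gamma: float,
                        q_online_next: Matrix, q_target_next: Matrix,
                        done: bool) -> float:
    """Double-DQN style target with a worst-case opponent (assumed form).

    The online network selects the own maneuver; the target network
    evaluates that maneuver under the opponent's worst-case reply.
    """
    if done:
        return reward
    a_star = maximin_action(q_online_next)            # selection: online net
    return reward + gamma * min(q_target_next[a_star])  # evaluation: target net


# Toy next-state Q-matrices (made-up numbers, for illustration only).
q_online = [[1.0, 0.2], [0.5, 0.6]]
q_target = [[2.0, 0.0], [0.4, 0.9]]
target = ddqn_minimax_target(1.0, 0.9, q_online, q_target, done=False)
```

Separating action selection (online network) from action evaluation (target network) is the defining feature of double Q-learning; the inner min over the opponent's maneuvers is the assumed game-theoretic part of this sketch.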
Pages: 20