Autonomous Maneuver Decision of UCAV Air Combat Based on Double Deep Q Network Algorithm and Stochastic Game Theory

Cited by: 16

Authors:

Cao, Yuan [1]

Kou, Ying-Xin [1]

Li, Zhan-Wu [1]

Xu, An [1]

Affiliations:

[1] AF Engn Univ, Aviat Engn Coll, Xian 710038, Peoples R China

Source:

INTERNATIONAL JOURNAL OF AEROSPACE ENGINEERING | Year 2023 / Volume 2023

Keywords:

UAV;

DOI:

10.1155/2023/3657814

Chinese Library Classification (CLC) number:

V [Aviation, Aerospace];

Discipline classification code:

08 ; 0825 ;

Abstract:

In modern air combat, an unmanned combat aerial vehicle (UCAV) has difficulty perceiving situation information quickly and accurately and making maneuver decisions autonomously, and its decisions are easily affected by complex factors. To address this problem, this paper proposes a UCAV maneuver decision algorithm that combines deep reinforcement learning with game theory. First, based on the UCAV dynamics model and a maneuver library, an air combat situation assessment model and an advantage reward function are established, and sample data for the situation assessment indicators are constructed using the structure entropy weight method. Second, a convolutional neural network (CNN) processes the high-dimensional continuous situation features of the UCAV in air combat, eliminating correlation and redundancy between the features, and is trained to approximate the action-value function. Then, the double deep Q network (DDQN) algorithm from reinforcement learning (RL) is used to train the agent through interaction with the environment, and it is combined with the Minimax algorithm from stochastic game theory to solve for the optimal value function in each specific state, which yields the optimal maneuver decision of the UCAV. Air combat simulation results show that, with this method, the UCAV can choose maneuvers autonomously in different situations and quickly occupy a dominant position, which greatly improves its combat effectiveness.
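
To illustrate how a DDQN update and a minimax step of the kind the abstract describes might fit together, here is a minimal Python sketch. It is not the authors' implementation. It assumes a small discrete maneuver library for both aircraft, Q-values stored as a table indexed by own maneuver and opponent maneuver, and a pure-strategy maximin in place of the mixed-strategy solution that a stochastic-game formulation would normally compute. The names `maximin_action` and `ddqn_minimax_target` are hypothetical.

```python
from typing import List

# Q[a][o]: estimated value when we fly maneuver a and the opponent flies o.
Matrix = List[List[float]]


def maximin_action(q: Matrix) -> int:
    """Return the own maneuver whose worst-case value is largest.

    Assumption: pure strategies only (a simplification of Minimax-Q).
    """
    worst_case = [min(row) for row in q]
    return max(range(len(q)), key=lambda a: worst_case[a])


def ddqn_minimax_target(
    reward: float,
    gamma: float,
    q_online_next: Matrix,
    q_target_next: Matrix,
    done: bool,
) -> float:
    """Double-DQN style target with a worst-case opponent.

    The online table selects the maneuver; the target table evaluates it.
    Decoupling selection from evaluation is the core idea of Double DQN.
    """
    if done:
        return reward
    a_star = maximin_action(q_online_next)   # selection (online network)
    value = min(q_target_next[a_star])       # evaluation (target network)
    return reward + gamma * value


# Hypothetical next-state Q tables: 2 own maneuvers x 2 opponent maneuvers.
q_online = [[1.0, 0.2], [0.5, 0.6]]
q_target = [[0.9, 0.1], [0.4, 0.7]]
y = ddqn_minimax_target(1.0, 0.9, q_online, q_target, done=False)
```

In this toy case the online table prefers the second maneuver because its worst case (0.5) beats the first maneuver's worst case (0.2), and the target table then supplies the value used in the bootstrap.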

References

Pages: 20

30 entries in total

[21]

van Hasselt H, 2016, AAAI CONF ARTIF INTE, P2094

[22]

Wang Xuan, 2019, Ordnance Industry Automation, V38, P42, DOI 10.7690/bgzdh.2019.01.010

[23]

WATKINS CJCH, 1992, MACH LEARN, V8, P279, DOI 10.1007/BF00992698

[24] 3U: Joint Design of UAV-USV-UUV Networks for Cooperative Target Hunting [J].

Wei, Wei ;

Wang, Jingjing ;

Fang, Zhengru ;

Chen, Jianrui ;

Ren, Yong ;

Dong, Yuhan .

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2023, 72 (03) :4085-4090

[25] Three-dimensional aircraft terrain-following via real-time optimal control [J].

Williams, Paul .

JOURNAL OF GUIDANCE CONTROL AND DYNAMICS, 2007, 30 (04) :1201-1206

[26] Maneuver Decision of UAV in Short-Range Air Combat Based on Deep Reinforcement Learning [J].

Yang, Qiming ;

Zhang, Jiandong ;

Shi, Guoqing ;

Hu, Jinwen ;

Wu, Yong .

IEEE ACCESS, 2020, 8 :363-378

[27]

Yang QM, 2019, IEEE INT CONF CON AU, P37, DOI [10.1109/ICCA.2019.8899703, 10.1109/icca.2019.8899703]

[28]

Yang W, 2020, Acta Aeronautica et Astronautica Sinica, V41

[29]

ZHANG Y, 2017, Acta Aeronautica et Astronautica Sinica, V38

[30]

Zhong Y., 2008, Acta Aeronautica et Astronautica Sinica, V29, P114
