Close air combat maneuver decision based on deep stochastic game

Cited by: 7
Authors
Ma W. [1]
Li H. [1,2]
Wang Z. [1]
Huang Z. [1]
Wu Z. [2]
Chen X. [3]
Affiliations
[1] College of Computer Science, Sichuan University, Chengdu
[2] National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University, Chengdu
[3] College of Command and Control Engineering, Army Engineering University, Nanjing
Source
Xi Tong Gong Cheng Yu Dian Zi Ji Shu/Systems Engineering and Electronics | 2021 / Vol. 43 / No. 02
Keywords
Air combat strategy; Deep reinforcement learning; Game theory; Stochastic game;
DOI
10.12305/j.issn.1001-506X.2021.02.19
Abstract
To address the complexity of combat information in air combat and the difficulty of rapidly and accurately perceiving the situation and making decisions, an algorithm combining game theory with deep reinforcement learning is proposed. First, following the typical one-on-one air combat process and the formalism of stochastic games, a two-aircraft multi-state game model is constructed for red-blue confrontation in close air combat. Second, a deep Q network (DQN) is used to handle the fighter's continuous, infinite state space. Then, the Minimax algorithm is used to formulate a linear program that solves the optimal value function of the stage game in each specific state, and the network approximating the value function is trained. Finally, the optimal maneuver strategy is obtained from the output of the trained network. Simulation results show that the algorithm adapts well and behaves intelligently in air combat: it can effectively select favorable maneuver actions and occupy a dominant position according to the opponent's action strategy. © 2021, Editorial Office of Systems Engineering and Electronics. All rights reserved.
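The Minimax step described in the abstract, solving the optimal value of the stage game in a single state by linear programming, can be sketched as follows. This is an illustrative sketch, not the paper's implementation: it uses SciPy's `linprog` to compute the maximin mixed strategy and game value for the row player of a zero-sum payoff matrix, which in a Minimax-Q setting would be the matrix of Q-values over joint (own action, opponent action) pairs at the current state.

```python
import numpy as np
from scipy.optimize import linprog

def minimax_strategy(payoff):
    """Solve max_x min_j x^T A[:, j] for a zero-sum matrix game via an LP.

    payoff: m x n matrix A, entry A[i, j] is the row player's payoff when
    the row player takes action i and the column player takes action j.
    Returns (x, v): the row player's optimal mixed strategy and game value.
    """
    A = np.asarray(payoff, dtype=float)
    m, n = A.shape
    # Decision variables: mixed strategy x (m entries) followed by value v.
    c = np.zeros(m + 1)
    c[-1] = -1.0                                # maximize v == minimize -v
    # For every opponent action j: v - x^T A[:, j] <= 0
    A_ub = np.hstack([-A.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    # Probabilities sum to 1; v is unconstrained by this row.
    A_eq = np.append(np.ones(m), 0.0).reshape(1, -1)
    b_eq = np.array([1.0])
    bounds = [(0.0, 1.0)] * m + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:m], res.x[-1]

# Matching pennies: the optimal strategy is uniform and the value is 0.
x, v = minimax_strategy([[1, -1], [-1, 1]])
```

In Littman's Minimax-Q framework (reference [16] below), this LP is solved at every visited state with the DQN's Q-value estimates as the payoff matrix, and the resulting game value serves as the bootstrapping target when training the network.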
Pages: 443-451
Number of pages: 8
References
32 in total
[11]  
DENG K, PENG X Q, ZHOU D Y., Study on air combat decision method of UAV based on matrix game and genetic algorithm, Fire Control & Command Control, 44, 12, pp. 61-66, (2019)
[12]  
ZHOU G X, ZHOU F., A preliminary study of Alpha, the US Army's artificial intelligence air combat system, Proc. of the 6th China Command and Control Conference, pp. 66-70, (2018)
[13]  
SILVER D, SCHRITTWIESER J, SIMONYAN K, et al., Mastering the game of Go without human knowledge, Nature, 550, 7676, pp. 354-359, (2017)
[14]  
FRANCOIS-LAVET V, HENDERSON P, ISLAM R, et al., An introduction to deep reinforcement learning, Foundations and Trends in Machine Learning, 11, 3, pp. 219-354, (2018)
[15]  
MA Y F, MA X L, SONG X., A case study on air combat decision using approximated dynamic programming, Mathematical Problems in Engineering, 4, pp. 183-193, (2014)
[16]  
LITTMAN M L., Markov games as a framework for multi-agent reinforcement learning, Proc. of the 11th International Conference on Machine Learning, pp. 157-163, (1994)
[17]  
CORCHON L C, MARINI M A., Handbook of Game Theory and Industrial Organization, Volume I: Theory, (2018)
[18]  
PAVLIDIS N G, PARSOPOULOS K E, VRAHATIS M N., Computing Nash equilibria through computational intelligence methods, Journal of Computational & Applied Mathematics, 175, 1, pp. 113-136, (2005)
[19]  
BARDHAN R., An SDRE based differential game approach for maneuvering target interception, Proc. of the AIAA Guidance, Navigation and Control Conference, pp. 704-711, (2015)
[20]  
OYLER D W, KABAMBA P T, GIRARD A., Pursuit-evasion games in the presence of obstacles, Automatica, 65, pp. 1-11, (2016)