Close air combat maneuver decision based on deep stochastic game

Cited: 7
Authors
Ma W. [1 ]
Li H. [1 ,2 ]
Wang Z. [1 ]
Huang Z. [1 ]
Wu Z. [2 ]
Chen X. [3 ]
Affiliations
[1] College of Computer Science, Sichuan University, Chengdu
[2] National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University, Chengdu
[3] College of Command and Control Engineering, Army Engineering University, Nanjing
Source
Xi Tong Gong Cheng Yu Dian Zi Ji Shu/Systems Engineering and Electronics | 2021 / Vol. 43 / No. 02
Keywords
Air combat strategy; Deep reinforcement learning; Game theory; Stochastic game;
DOI
10.12305/j.issn.1001-506X.2021.02.19
Abstract
To address the complexity of combat information in air combat and the difficulty of quickly and accurately perceiving the situation and making decisions, an algorithm combining game theory with deep reinforcement learning is proposed. First, based on the typical one-on-one air combat process and the formalism of stochastic games, a two-aircraft multi-state game model is constructed for red-versus-blue confrontation in close air combat. Second, a deep Q-network (DQN) is used to handle the fighter's continuous, infinite state space. Then, the Minimax algorithm is used to formulate a linear program that solves the optimal value function of the stage game in each specific state, and the network approximating the value function is trained. Finally, the optimal maneuver strategy is obtained from the output of the trained network. Simulation results show that the algorithm adapts well and behaves intelligently in air combat: it effectively selects favorable maneuvers and gains a dominant position in response to the opponent's action strategy. © 2021, Editorial Office of Systems Engineering and Electronics. All rights reserved.
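The Minimax step described in the abstract, solving the optimal value of the zero-sum stage game in each state by linear programming, can be sketched as follows. This is a minimal illustration using a SciPy-based formulation, not the authors' implementation; the function name and matrix-game setup are assumptions for the example.

```python
import numpy as np
from scipy.optimize import linprog

def solve_stage_game(Q):
    """Solve the zero-sum stage game max_p min_j sum_i p[i] * Q[i, j] as an LP.

    Q[i, j] is the payoff to the maximizing player for playing action i
    against opponent action j (e.g., Q-values of maneuver pairs in one state).
    Returns the optimal mixed policy p over rows and the game value v.
    """
    m, n = Q.shape
    # Decision variables x = [p_1, ..., p_m, v]; maximize v -> minimize -v.
    c = np.zeros(m + 1)
    c[-1] = -1.0
    # For every opponent column j: v - sum_i p_i * Q[i, j] <= 0.
    A_ub = np.hstack([-Q.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    # The policy is a probability distribution: sum_i p_i = 1.
    A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])
    b_eq = np.ones(1)
    bounds = [(0, None)] * m + [(None, None)]  # p_i >= 0, v unbounded
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:-1], res.x[-1]

# Matching pennies as a sanity check: the value is 0, the policy uniform.
p, v = solve_stage_game(np.array([[1.0, -1.0], [-1.0, 1.0]]))
```

In a Minimax-Q-style algorithm, `Q` for a given state would come from the DQN's outputs over joint maneuver actions, and the resulting stage-game value `v` would serve as the bootstrap target when training the network.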
Pages: 443-451
Page count: 8