Research on Multi-aircraft Cooperative Air Combat Method Based on Deep Reinforcement Learning

Cited by: 0
Authors
Shi W. [1 ]
Feng Y.-H. [1 ]
Cheng G.-Q. [1 ]
Huang H.-L. [1 ]
Huang J.-C. [1 ]
Liu Z. [1 ]
He W. [2 ,3 ]
Affiliations
[1] College of Systems Engineering, National University of Defense Technology, Changsha
[2] Institute of Artificial Intelligence, University of Science and Technology Beijing, Beijing
[3] School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing
Source
Corresponding author: Feng, Yang-He (fengyanghe@nudt.edu.cn) | Acta Automatica Sinica, Vol. 47, Science Press, 2021
Funding
National Natural Science Foundation of China;
Keywords
Deep reinforcement learning; Enhancement mechanism; Intelligent decision; Multi-aircraft cooperative air combat; Proximal policy optimization (PPO) algorithm;
DOI
10.16383/j.aas.c201059
Abstract
Multi-aircraft cooperation is a key element of air combat, and handling the complex cooperative relationships among multiple entities is a pressing problem that must be solved. To address intelligent decision-making in multi-aircraft cooperative air combat, this paper proposes a deep-reinforcement-learning-based multi-aircraft cooperative air combat decision framework (DRL-MACACDF). Building on proximal policy optimization (PPO), four algorithm enhancement mechanisms are designed to improve the degree of coordination among agents in multi-aircraft cooperative confrontation scenarios. The feasibility and practicality of the method are verified through simulation on a wargame platform, an interpretable post-hoc analysis of the engagement data is carried out, and the cross-disciplinary research direction combining reinforcement learning with traditional wargaming is discussed. Copyright © 2021 Acta Automatica Sinica. All rights reserved.
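The abstract names proximal policy optimization (PPO) as the base algorithm. As background only, the minimal NumPy sketch below illustrates the standard PPO clipped surrogate objective; it is an illustrative assumption, not the paper's DRL-MACACDF implementation or any of its four enhancement mechanisms.

```python
import numpy as np

def ppo_clipped_objective(log_probs_new, log_probs_old, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate objective (Schulman et al., 2017).

    log_probs_new : log pi_theta(a|s) under the current policy
    log_probs_old : log pi_theta_old(a|s) under the data-collecting policy
    advantages    : advantage estimates A(s, a)
    """
    # Probability ratio r_t(theta) = pi_theta / pi_theta_old
    ratio = np.exp(log_probs_new - log_probs_old)
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    # Element-wise minimum keeps the policy update close to the old policy.
    return np.mean(np.minimum(ratio * advantages, clipped * advantages))

# Toy usage with random data (illustrative only).
rng = np.random.default_rng(0)
new_lp = rng.normal(-1.0, 0.1, size=64)
old_lp = rng.normal(-1.0, 0.1, size=64)
adv = rng.normal(0.0, 1.0, size=64)
print(ppo_clipped_objective(new_lp, old_lp, adv))
```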
Pages: 1610-1623
Page count: 13
References
40 references in total
[21]  
Han Tong, Cui Ming-Lang, Zhang Wei, Chen Guo-Ming, Wang Xiao-Fei, Multi-UCAV cooperative air combat maneuvering decision, Journal of Ordnance Equipment Engineering, 41, pp. 117-123, (2020)
[22]  
Ji Hui-Ming, Yu Min-Jian, Qiao Xin-Hang, Yang Hai-Yan, Zhang Shuai-Wen, Application of the improved BAS-TIMS algorithm in air combat maneuver decision, Journal of National University of Defense Technology, 42, pp. 123-133, (2020)
[23]  
Wang Xuan, Wang Wei-Jia, Song Ke-Pu, Wang Min-Wen, UAV air combat decision based on evolutionary expert system tree, Ordnance Industry Automation, 38, pp. 42-47, (2019)
[24]  
Zhou Tong-Le, Chen Mou, Zhu Rong-Gang, He Jian-Liang, Attack-defense satisficing decision-making of multi-UAVs cooperative multiple targets based on WPS Algorithm, Journal of Command and Control, 6, pp. 251-256, (2020)
[25]  
Zuo Jia-Liang, Yang Ren-Nong, Zhang Ying, Li Zhong-Lin, Wu Meng, Intelligent decision-making in air combat maneuvering based on heuristic reinforcement learning, Acta Aeronautica et Astronautica Sinica, 38, 10, pp. 217-230, (2017)
[26]  
Liu Shu-Lin, A new method of evaluation, Systems Engineering-Theory and Practice, 11, 4, pp. 63-66, (1991)
[27]  
Zhang H P, Huang C Q, Zhang Z R, Wang X F, Han B, Wei Z L, et al., The trajectory generation of UCAV evading missiles based on neural networks, Journal of Physics: Conference Series, 1486, (2020)
[28]  
Teng T H, Tan A H, Tan Y S, Yeo A., Self-organizing neural networks for learning air combat maneuvers, Proceedings of the 2012 International Joint Conference on Neural Networks, pp. 2858-2866, (2012)
[29]  
Meng Guang-Lei, Ma Xiao-Yu, Liu Xin, Xu Yi-Min, Situation assessment for unmanned aerial vehicles air combat based on hybrid dynamic Bayesian network, Command Control and Simulation, 39, (2017)
[30]  
Yang Ai-Wu, Li Zhan-Wu, Xu An, Xi Zhi-Fei, Chang Yi-Zhe, Threat level assessment of the air combat target based on weighted cloud dynamic Bayesian network, Flight Dynamics, 38, pp. 87-94, (2020)