A hierarchical reinforcement learning method on Multi UCAV air combat

Cited by: 2
Authors
Wang, Yabin [1]
Jiang, Tianshu [1]
Li, Youjiang [1]
Affiliations
[1] Informat Syst Engn Important Lab, Nanjing 210000, Jiangsu, Peoples R China
Source
2021 INTERNATIONAL CONFERENCE ON NEURAL NETWORKS, INFORMATION AND COMMUNICATION ENGINEERING | 2021 / Vol. 11933
Keywords
hierarchical reinforcement learning; ppo; air combat; UCAV; deep learning; neural network; human knowledge; BFM;
DOI
10.1117/12.2615268
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
In recent years, unmanned combat aerial vehicle (UCAV) techniques have been a hot research topic. Many researchers are studying how to use UCAVs to fulfill missions and defend against enemies on simulation platforms. Different AI agents have been constructed to control virtual UCAVs performing tasks on these platforms. Rule-based AI depends heavily on human knowledge and lacks flexibility, so it cannot adapt to a changing environment. Reinforcement-learning-based AI has an advantage over rule-based AI because it depends less on human knowledge. This paper proposes a hierarchical reinforcement learning method for multi-UCAV air combat on a simulation platform. The experimental results show that the hierarchical approach can outperform a state-of-the-art air combat method.
Pages: 7