A Hierarchical Deep Reinforcement Learning Framework for 6-DOF UCAV Air-to-Air Combat

Cited by: 24
Authors
Chai, Jiajun [1, 2]
Chen, Wenzhang [1, 2]
Zhu, Yuanheng [1, 2]
Yao, Zong-Xin [3]
Zhao, Dongbin [1, 2]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab Multimodal Artificial Intelligence S, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
[3] Shenyang Aircraft Design & Res Inst, Dept Unmanned Aerial Vehicle, Shenyang 110035, Peoples R China
Source
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS | 2023 / Vol. 53 / No. 09
Funding
National Natural Science Foundation of China;
Keywords
Aircraft; Aerospace control; 6-DOF; Task analysis; Nose; Missiles; Heuristic algorithms; 6-DOF unmanned combat air vehicle (UCAV); air combat; hierarchical structure; reinforcement learning (RL); self-play; LEVEL; GAME;
DOI
10.1109/TSMC.2023.3270444
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Unmanned combat air vehicle (UCAV) combat is a challenging scenario with high-dimensional continuous state and action spaces and highly nonlinear dynamics. In this article, we propose a general hierarchical framework to solve the within-visual-range (WVR) air-to-air combat problem under six-degree-of-freedom (6-DOF) dynamics. The core idea is to divide the decision-making process into two loops and use reinforcement learning (RL) to solve each separately. The outer loop uses a combat policy to decide the macro command according to the current combat situation. The inner loop then uses a control policy to execute the macro command by computing the actual input signals for the aircraft. We design the Markov decision process for the control policy and the Markov game between two aircraft, and present a two-stage training mechanism. For the control policy, we design an effective reward function to accurately track various macro behaviors. For the combat policy, we present a fictitious self-play mechanism that improves combat performance by fighting against historical combat policies. Experimental results show that the control policy achieves better tracking performance than conventional methods, and that the fictitious self-play mechanism learns a competitive combat policy that achieves high winning rates against conventional methods.
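The two-loop structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the one-dimensional toy state, the proportional controller, and the uniform sampling of historical opponents are all assumptions made for illustration, and the paper's actual macro commands, reward functions, and networks are not given in this record.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical type aliases; the paper's real state and command spaces are not given here.
State = List[float]
MacroCommand = Sequence[float]   # assumed: a target the inner loop should track
ControlInput = Sequence[float]   # assumed: low-level actuator signals

CombatPolicy = Callable[[State], MacroCommand]
ControlPolicy = Callable[[State, MacroCommand], ControlInput]
Dynamics = Callable[[State, ControlInput], State]


def hierarchical_step(
    state: State,
    combat_policy: CombatPolicy,
    control_policy: ControlPolicy,
    dynamics: Dynamics,
    inner_steps: int,
) -> Tuple[State, MacroCommand]:
    """One outer-loop decision followed by `inner_steps` inner-loop control steps."""
    command = combat_policy(state)           # outer loop: choose a macro command
    for _ in range(inner_steps):
        u = control_policy(state, command)   # inner loop: compute input signals
        state = dynamics(state, u)           # advance the (toy) aircraft model
    return state, command


def sample_opponent(history: List[CombatPolicy], rng: random.Random) -> CombatPolicy:
    """Pick a past combat policy to train against (uniform sampling is an assumption)."""
    return rng.choice(history)


# Toy stand-ins so the sketch runs end to end; none of these come from the paper.
def toy_combat_policy(state: State) -> MacroCommand:
    return [5.0]                             # always request the value 5.0


def toy_control_policy(state: State, command: MacroCommand) -> ControlInput:
    return [0.5 * (command[0] - state[0])]   # proportional tracking of the command


def toy_dynamics(state: State, u: ControlInput) -> State:
    return [state[0] + u[0]]                 # simple integrator
```

In this toy setting, calling `hierarchical_step([0.0], toy_combat_policy, toy_control_policy, toy_dynamics, inner_steps=3)` moves the state toward the commanded value over three inner steps, while `sample_opponent` stands in for drawing an opponent from the pool of historical combat policies during self-play.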
Pages: 5417 - 5429
Page count: 13
Related papers
50 in total
  • [1] A hierarchical reinforcement learning method on Multi UCAV air combat
    Wang, Yabin
    Jiang, Tianshu
    Li, Youjiang
    2021 INTERNATIONAL CONFERENCE ON NEURAL NETWORKS, INFORMATION AND COMMUNICATION ENGINEERING, 2021, 11933
  • [2] Deep Reinforcement Learning-Based Air-to-Air Combat Maneuver Generation in a Realistic Environment
    Bae, Jung Ho
    Jung, Hoseong
    Kim, Seogbong
    Kim, Sungho
    Kim, Yong-Duk
    IEEE ACCESS, 2023, 11 : 26427 - 26440
  • [3] Reinforcement Learning for Multiaircraft Autonomous Air Combat in Multisensor UCAV Platform
    Kong, Weiren
    Zhou, Deyun
    Du, Yongjie
    Zhou, Ying
    Zhao, Yiyang
    IEEE SENSORS JOURNAL, 2023, 23 (18) : 20596 - 20606
  • [4] Mastering air combat game with deep reinforcement learning
    Zhu, Jingyu
    Kuang, Minchi
    Zhou, Wenqing
    Shi, Heng
    Zhu, Jihong
    Han, Xu
    DEFENCE TECHNOLOGY, 2024, 34 : 295 - 312
  • [5] Deep Reinforcement-Learning-Based Air-Combat-Maneuver Generation Framework
    Mei, Junru
    Li, Ge
    Huang, Hesong
    MATHEMATICS, 2024, 12 (19)
  • [6] Deep Reinforcement Learning-Based Decision Making for Six Degree of Freedom UCAV Close Range Air Combat
    Zhou, Pan
    Li, Ni
    Huang, Jiangtao
    Zhang, Sheng
    Zhou, Xiaoyu
    Liu, Gang
    2023 ASIA-PACIFIC INTERNATIONAL SYMPOSIUM ON AEROSPACE TECHNOLOGY, VOL II, APISAT 2023, 2024, 1051 : 320 - 334
  • [7] Learning Continuous 3-DoF Air-to-Air Close-in Combat Strategy using Proximal Policy Optimization
    Li, Luntong
    Zhou, Zhiming
    Chai, Jiajun
    Liu, Zhen
    Zhu, Yuanheng
    Yi, Jianqiang
    2022 IEEE CONFERENCE ON GAMES, COG, 2022, : 616 - 619
  • [8] Hierarchical Multi-Agent Reinforcement Learning for Air Combat Maneuvering
    Selmonaj, Ardian
    Szehr, Oleg
    Del Rio, Giacomo
    Antonucci, Alessandro
    Schneider, Adrian
    Ruegsegger, Michael
    22ND IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, ICMLA 2023, 2023, : 1031 - 1038
  • [9] Hierarchical Reinforcement Learning for Air Combat at DARPA's AlphaDogfight Trials
    Pope A.P.
    Ide J.S.
    Mićović D.
    Diaz H.
    Twedt J.C.
    Alcedo K.
    Walker T.T.
    Rosenbluth D.
    Ritholtz L.
    Javorsek D.
    IEEE Transactions on Artificial Intelligence, 2023, 4 (06) : 1371 - 1385
  • [10] Learning and Fast Adaptation for Air Combat Decision with Improved Deep Meta-reinforcement Learning
    Zhang, Pin
    Dong, Wenhan
    Cai, Ming
    Li, Dunwang
    Zhang, Xin
    INTERNATIONAL JOURNAL OF AERONAUTICAL AND SPACE SCIENCES, 2024,