A hierarchical reinforcement learning framework for multi-UAV combat using leader-follower strategy

Times cited: 0

Authors:

Pang, Jinhui [1]

He, Jinglin [1]

Mohamed, Noureldin Mohamed Abdelaal Ahmed [1]

Lin, Changqing [1]

Zhang, Zhihui [1]

Hao, Xiaoshuai [2]

Affiliations:

[1] Beijing Inst Technol, Sch Comp, Beijing 100081, Peoples R China

[2] Beijing Acad Artificial Intelligence, Beijing 100084, Peoples R China

Source:

KNOWLEDGE-BASED SYSTEMS | 2025 / Vol. 316

Funding:

Beijing Natural Science Foundation;

Keywords:

Air combat; Multi-agent; Coordination; Hierarchical structure; Reinforcement learning;

DOI:

10.1016/j.knosys.2025.113387

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Multi-UAV air combat is a complex task involving multiple autonomous UAVs and an evolving field in both aerospace and artificial intelligence. This paper aims to enhance adversarial performance through collaborative strategies. Previous approaches predominantly discretize the action space into predefined actions, limiting UAV maneuverability and the implementation of complex strategies. Others simplify the problem to 1v1 combat, neglecting the cooperative dynamics among multiple UAVs. To address the high-dimensional challenges inherent in six-degree-of-freedom space and improve cooperation, we propose a hierarchical framework utilizing the Leader-Follower Multi-Agent Proximal Policy Optimization (LFMAPPO) strategy. Specifically, the framework is structured into three levels. The top level conducts a macro-level assessment of the environment and guides the execution policy. The middle level determines the angle of the desired action. The bottom level generates precise action commands for the high-dimensional action space. Moreover, we optimize the state-value functions by assigning distinct roles with the leader-follower strategy to train the top-level policy: followers estimate the leader's utility, which promotes effective cooperation among agents. Additionally, a target selector, aligned with the UAVs' posture, assesses the threat level of targets. Finally, simulation experiments validate the effectiveness of our proposed method.
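The three-level structure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify observations, modes, or control channels, so every name here (`Observation`, `top_level`, `middle_level`, `bottom_level`, the mode labels, the 5000 m threshold, and the control keys) is a hypothetical stand-in, and simple hand-written rules replace the learned policies. The sketch only shows how a decision might flow from a macro-level assessment, to a desired angle, to continuous commands.

```python
import math
from dataclasses import dataclass


@dataclass
class Observation:
    # Hypothetical features; the abstract does not define the observation space.
    distance_m: float   # range to the currently selected target
    bearing_rad: float  # angle from own heading to the target, in [-pi, pi]


def top_level(obs: Observation) -> str:
    """Macro-level assessment: choose an execution mode (placeholder rule)."""
    return "engage" if obs.distance_m < 5000.0 else "approach"


def middle_level(obs: Observation, mode: str) -> float:
    """Desired heading change in radians for the chosen mode (placeholder rule)."""
    gain = 1.0 if mode == "engage" else 0.5
    return gain * obs.bearing_rad


def bottom_level(desired_turn_rad: float) -> dict:
    """Map the desired angle to continuous commands clipped to [-1, 1]."""
    aileron = max(-1.0, min(1.0, desired_turn_rad / math.pi))
    return {"aileron": aileron, "elevator": 0.0, "rudder": 0.0, "throttle": 0.8}


def act(obs: Observation):
    """Run one decision step through all three levels, top to bottom."""
    mode = top_level(obs)
    turn = middle_level(obs, mode)
    return mode, bottom_level(turn)
```

In a learned system each of the three functions would be replaced by a trained policy; the point of the layering is that only the bottom level has to deal with the continuous, high-dimensional command space.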


Pages: 10

References: 41 in total

