Multi-Agent Reinforcement Learning-Based Coordinated Dynamic Task Allocation for Heterogenous UAVs

Cited by: 35

Authors:

Liu, Da [1]

Dou, Liqian [1]

Zhang, Ruilong [2]

Zhang, Xiuyun [1]

Zong, Qun [1]

Affiliations:

[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China

[2] Beijing Aerosp Automat Control Inst, Beijing 100143, Peoples R China

Source:

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY | 2023 / Vol. 72 / No. 04

Funding:

National Natural Science Foundation of China;

Keywords:

Coordinated dynamic task allocation; multi-agent reinforcement learning; heterogeneous unmanned aerial vehicles; trajectory design; search; game; internet; system

DOI:

10.1109/TVT.2022.3228198

Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

This paper studies the coordinated dynamic task allocation (CDTA) problem for heterogeneous unmanned aerial vehicles (UAVs) under environmental uncertainty. Dynamic task allocation addresses the reallocation of resources after new tasks appear, so that a multi-UAV system can respond quickly to new information and objectives. The proposed CDTA strategy for heterogeneous UAVs combines a proposer-responser mechanism with prioritized experience replay: a multi-agent reinforcement learning (MARL)-based coordinated network is constructed to propose requests, and a Q-network is developed to approximate the expected return and determine whether a responser participates in the dynamic task. The CDTA algorithm accounts for the uncertainty of dynamic tasks and scales well across different UAV groups, which reduces the online computational burden and speeds up online operation. Experiments show that prioritized experience replay accelerates the convergence of the algorithm, and its scalability is verified for groups of 10 to 180 UAVs. Comparison simulations against game theory-based and reinforcement learning-based methods demonstrate the effectiveness of the proposed algorithm.
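
The abstract names two ingredients without giving their details: prioritized experience replay and a Q-network that decides whether a responser joins a task. The sketch below is only an illustration of those ideas, not the authors' implementation. It assumes the common proportional variant of prioritized replay, in which a transition is sampled with probability proportional to its TD error raised to a power `alpha`, and a greedy response rule that compares two estimated returns. The class and function names, the parameters `alpha`, `beta`, and `eps`, and the rule itself are hypothetical.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay: P(i) = p_i**alpha / sum_k p_k**alpha.

    Illustrative only; the replay variant used in the paper is not stated in the abstract.
    """

    def __init__(self, capacity, alpha=0.6, eps=1e-3):
        self.capacity = capacity
        self.alpha = alpha  # 0 gives uniform sampling, 1 gives fully proportional sampling
        self.eps = eps      # keeps every priority strictly positive
        self.items = []
        self.priorities = []
        self.next_slot = 0

    def add(self, transition):
        # New transitions get the current maximum priority so each is likely replayed soon.
        priority = max(self.priorities, default=1.0)
        if len(self.items) < self.capacity:
            self.items.append(transition)
            self.priorities.append(priority)
        else:
            # Buffer is full: overwrite the oldest slot.
            self.items[self.next_slot] = transition
            self.priorities[self.next_slot] = priority
        self.next_slot = (self.next_slot + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        n = len(self.items)
        weights = [p ** self.alpha for p in self.priorities]
        total = sum(weights)
        indices = rng.choices(range(n), weights=weights, k=batch_size)
        # Importance-sampling weights correct the bias of non-uniform sampling.
        is_weights = [(n * weights[i] / total) ** (-beta) for i in indices]
        max_w = max(is_weights)
        is_weights = [w / max_w for w in is_weights]
        return indices, [self.items[i] for i in indices], is_weights

    def update(self, indices, td_errors):
        # After a learning step, priorities follow the new absolute TD errors.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = abs(err) + self.eps


def responder_participates(q_join, q_decline):
    """Hypothetical greedy rule: join when the estimated return of joining is higher."""
    return q_join > q_decline
```

In a training loop, the indices returned by `sample` would be passed back to `update` together with the freshly computed TD errors, so that transitions the Q-network predicts poorly are replayed more often.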

Pages: 4372-4383

Page count: 12
