Multi-UAV Cooperative Search Based on Reinforcement Learning With a Digital Twin Driven Training Framework

Cited by: 48
Authors
Shen, Gaoqing [1 ]
Lei, Lei [1 ]
Zhang, Xinting [1 ]
Li, Zhilin [1 ]
Cai, Shengsuo [1 ]
Zhang, Lijuan [1 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Elect & Informat Engn, Nanjing 211106, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Cooperative target search; digital twin; multi-agent deep reinforcement learning; unmanned aerial vehicles; TARGET SEARCH; FUSION;
DOI
10.1109/TVT.2023.3245120
Chinese Library Classification (CLC)
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology]
Discipline Classification Codes
0808; 0809
Abstract
This paper considers the cooperative search for stationary targets by multiple unmanned aerial vehicles (UAVs) with limited sensing range and communication ability in a dynamic threatening environment. The main purpose is to use multiple UAVs to find more unknown targets as soon as possible, increase the coverage rate of the mission area, and, more importantly, guide UAVs away from threats. However, traditional search methods are mostly unscalable and perform poorly in dynamic environments. A new multi-agent deep reinforcement learning (MADRL) method, DNQMIX, is proposed in this study to solve the multi-UAV cooperative target search (MCTS) problem. The reward function is also newly designed for the MCTS problem to guide UAVs to explore and exploit the environment information more efficiently. Moreover, this paper proposes a digital twin (DT) driven training framework, "centralized training, decentralized execution, and continuous evolution" (CTDECE). It can facilitate the continuous evolution of MADRL models and resolve the tradeoff between training speed and environment fidelity when MADRL is applied to real-world multi-UAV systems. Simulation results show that DNQMIX outperforms state-of-the-art methods in terms of search rate and coverage rate.
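To make the two evaluation metrics in the abstract concrete, the following is a minimal toy sketch (not the paper's DNQMIX method): randomly walking UAVs on a grid mission area, where "coverage rate" is the fraction of cells sensed so far and "search rate" is the fraction of hidden stationary targets found. The grid size, sensing radius, and random-walk policy are illustrative assumptions.

```python
import random

GRID = 10        # mission area is a GRID x GRID cell map (assumed size)
N_UAVS = 3
N_TARGETS = 5
STEPS = 200
SENSE = 1        # Chebyshev sensing radius, modeling a limited sensing range

random.seed(0)
# Stationary targets at random cells (duplicates collapse in the set).
targets = {(random.randrange(GRID), random.randrange(GRID)) for _ in range(N_TARGETS)}
uavs = [(0, 0)] * N_UAVS
covered, found = set(), set()

def step(pos):
    """Move one cell in a random direction, clamped to the grid."""
    dx, dy = random.choice([(-1, 0), (1, 0), (0, -1), (0, 1)])
    return (min(max(pos[0] + dx, 0), GRID - 1),
            min(max(pos[1] + dy, 0), GRID - 1))

for _ in range(STEPS):
    uavs = [step(p) for p in uavs]
    for (x, y) in uavs:
        # Every cell within the sensing radius is observed this step.
        for i in range(max(0, x - SENSE), min(GRID, x + SENSE + 1)):
            for j in range(max(0, y - SENSE), min(GRID, y + SENSE + 1)):
                covered.add((i, j))
                if (i, j) in targets:
                    found.add((i, j))

coverage_rate = len(covered) / (GRID * GRID)   # fraction of area sensed
search_rate = len(found) / len(targets)        # fraction of targets found
print(f"coverage rate: {coverage_rate:.2f}, search rate: {search_rate:.2f}")
```

A learned MADRL policy would replace the random walk above; these two ratios are what the paper's simulations compare across methods.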
Pages: 8354-8368
Number of pages: 15
References
42 in total
[1] Ablavsky V., 2000, Proc. AIAA Guidance, Navigation, and Control Conference.
[2] Arulkumaran K., Deisenroth M. P., Brundage M., Bharath A. A. Deep Reinforcement Learning: A Brief Survey. IEEE Signal Processing Magazine, 2017, 34(6): 26-38.
[3] Bertuccelli L. F., 2005, Proc. IEEE Conference on Decision and Control, p. 5680.
[4] Booth K. E. C., Piacentini C., Bernardini S., Beck J. C. Target Search on Road Networks With Range-Constrained UAVs and Ground-Based Mobile Recharging Vehicles. IEEE Robotics and Automation Letters, 2020, 5(4): 6702-6709.
[5] Choset H. Coverage for Robotics - A Survey of Recent Results. Annals of Mathematics and Artificial Intelligence, 2001, 31(1-4): 113-126.
[6] Chung T. H., Burdick J. W. Analysis of Search Decision Making Using Probabilistic Search Strategies. IEEE Transactions on Robotics, 2012, 28(1): 132-144.
[7] Duan H., Zhao J., Deng Y., Shi Y., Ding X. Dynamic Discrete Pigeon-Inspired Optimization for Multi-UAV Cooperative Search-Attack Mission Planning. IEEE Transactions on Aerospace and Electronic Systems, 2021, 57(1): 706-720.
[8] Foerster J. N., 2018, Proc. AAAI Conference on Artificial Intelligence, p. 2974.
[9] Fortunato M., 2018, arXiv:1706.10295.
[10] Gao M., Zhang X. Cooperative Search Method for Multiple UAVs Based on Deep Reinforcement Learning. Sensors, 2022, 22(18).