Multi-target tracking for unmanned aerial vehicle swarms using deep reinforcement learning

Cited by: 50
Authors
Zhou, Wenhong [1 ]
Liu, Zhihong [1 ]
Li, Jie [1 ]
Xu, Xin [1 ]
Shen, Lincheng [1 ]
Affiliation
[1] Natl Univ Def Technol, Coll Intelligence Sci & Technol, Changsha 410073, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
UAV swarms; Multi-target tracking; Multi-agent reinforcement learning; Scalability; Feature representation; ROBOTS; ALGORITHMS; SEARCH;
DOI
10.1016/j.neucom.2021.09.044
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, deep reinforcement learning (DRL) has proved its great potential in multi-agent cooperation. However, applying DRL to the multi-target tracking (MTT) problem for unmanned aerial vehicle (UAV) swarms is challenging: 1) the number of UAVs may be large, but existing multi-agent reinforcement learning (MARL) methods that rely on global or joint information of all agents suffer from the curse of dimensionality; 2) the dimension of each UAV's received information is variable, which is incompatible with neural networks that have fixed input dimensions; 3) the UAVs are homogeneous and interchangeable, so each UAV's policy should be invariant to the permutation of its received information. To this end, we propose a DRL method for UAV swarms to solve the MTT problem. First, a decentralized swarm-oriented Markov Decision Process (MDP) model is presented for UAV swarms, which involves each UAV's local communication and partial observation. Second, to achieve better scalability, a cartogram feature representation (FR) is proposed to integrate the variable-dimensional information set into a fixed-shape input variable; the cartogram FR also remains invariant to the permutation of the information. Then, the double deep Q-learning network with dueling architecture is adapted to the MTT problem, and an experience-sharing training mechanism is adopted to learn a shared cooperative policy for UAV swarms. Extensive experiments show that our method successfully learns a cooperative tracking policy for UAV swarms and outperforms the baseline method in tracking ratio and scalability. (c) 2021 Elsevier B.V. All rights reserved.
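The core scalability idea in the abstract is that a variable-sized, unordered set of observations can be mapped into a fixed-shape, permutation-invariant input for a Q-network. A minimal sketch of that idea, assuming a simple spatial-histogram construction (the paper's exact cartogram layout, and the function name and parameters below, are illustrative assumptions):

```python
import numpy as np

def cartogram_fr(positions, sensing_range=10.0, grid=8):
    """Map a variable-sized set of relative positions (e.g. observed
    neighbors or targets) into a fixed grid x grid array of counts.

    Because the output is a histogram over space, its shape is constant
    no matter how many items are observed, and it does not change when
    the input set is reordered -- the two properties the abstract
    requires of the cartogram feature representation."""
    img = np.zeros((grid, grid), dtype=np.float32)
    for x, y in positions:
        # discretize each relative position into a grid cell
        i = int((x + sensing_range) / (2 * sensing_range) * grid)
        j = int((y + sensing_range) / (2 * sensing_range) * grid)
        if 0 <= i < grid and 0 <= j < grid:
            img[i, j] += 1.0
    return img

# Two orderings of the same observation set give identical features,
# and any number of observed items yields the same 8x8 shape.
a = cartogram_fr([(1.0, 2.0), (-3.0, 4.0), (0.5, -0.5)])
b = cartogram_fr([(0.5, -0.5), (1.0, 2.0), (-3.0, 4.0)])
assert a.shape == (8, 8) and np.array_equal(a, b)
```

A fixed 8x8 array like this can then feed a standard dueling double-DQN with fixed input dimensions, sidestepping the variable-dimension and permutation issues listed as challenges 2) and 3).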
Pages: 285-297
Page count: 13
Related papers
50 records in total
[31]   COLREG-Compliant Collision Avoidance for Unmanned Surface Vehicle Using Deep Reinforcement Learning [J].
Meyer, Eivind ;
Heiberg, Amalie ;
Rasheed, Adil ;
San, Omer .
IEEE ACCESS, 2020, 8 :165344-165364
[32]   Learning Optimal Parameters for Multi-target Tracking with Contextual Interactions [J].
Wang, Shaofei ;
Fowlkes, Charless C. .
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2017, 122 (03) :484-501
[33]   Multi-target Tracking using Mixed Spatio-Temporal Features Learning Model [J].
Ge Yinghui ;
Yu Jianjun .
2009 IEEE INTERNATIONAL CONFERENCE ON AUTOMATION AND LOGISTICS ( ICAL 2009), VOLS 1-3, 2009, :799-+
[34]   IMPROVED DATA ASSOCIATION ALGORITHM FOR AIRBORNE RADAR MULTI-TARGET TRACKING VIA DEEP LEARNING NETWORK [J].
Li, Wenna ;
Yang, Ailing ;
Zhang, Lianzhong .
2022 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2022), 2022, :7417-7420
[35]   Deep Reinforcement Learning Applied to a Spherical Robot for Target Tracking [J].
Escorza, Omar ;
Garcia, Gonzalo ;
Fabregas, Ernesto ;
Velastin, Sergio A. ;
Eskandarian, Azim ;
Farias, Gonzalo .
IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2025,
[36]   Traffic Flow Video Image Recognition and Analysis Based on Multi-Target Tracking Algorithm and Deep Learning [J].
Zou, Songshang ;
Chen, Hao ;
Feng, Hui ;
Xiao, Guangyi ;
Qin, Zhen ;
Cai, Weiwei .
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (08) :8762-8775
[37]   Three-dimensional vehicle multi-target tracking based on trajectory optimization [J].
Cai, Hua ;
Kou, Ting-Ting ;
Yang, Yi-Ning ;
Ma, Zhi-Yong ;
Wang, Wei-Gang ;
Sun, Jun-Xi .
JOURNAL OF JILIN UNIVERSITY (ENGINEERING AND TECHNOLOGY EDITION), 2024, 54 (08) :2338-2347
[38]   Collaborative Integrated Navigation for Unmanned Aerial Vehicle Swarms Under Multiple Uncertainties [J].
Zhang, Le ;
Cao, Xiaomeng ;
Su, Mudan ;
Sui, Yeye .
SENSORS, 2025, 25 (03)
[39]   An improved method for multi-target tracking [J].
Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, China .
INFORMATION TECHNOLOGY JOURNAL, 2007 :725-732
[40]   Q-learning-based routing inspired by adaptive flocking control for collaborative unmanned aerial vehicle swarms [J].
Alam, Muhammad Morshed ;
Moh, Sangman .
VEHICULAR COMMUNICATIONS, 2023, 40