Multi-target tracking for unmanned aerial vehicle swarms using deep reinforcement learning

Cited by: 50

Authors:

Zhou, Wenhong [1]

Liu, Zhihong [1]

Li, Jie [1]

Xu, Xin [1]

Shen, Lincheng [1]

Affiliation:

[1] National University of Defense Technology, College of Intelligence Science and Technology, Changsha 410073, People's Republic of China

Source:

NEUROCOMPUTING | 2021 / Vol. 466

Funding:

National Natural Science Foundation of China;

Keywords:

UAV swarms; Multi-target tracking; Multi-agent reinforcement learning; Scalability; Feature representation; ROBOTS; ALGORITHMS; SEARCH;

DOI:

10.1016/j.neucom.2021.09.044

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In recent years, deep reinforcement learning (DRL) has shown great potential in multi-agent cooperation. However, applying DRL to the multi-target tracking (MTT) problem for unmanned aerial vehicle (UAV) swarms is challenging: 1) the number of UAVs may be large, but existing multi-agent reinforcement learning (MARL) methods that rely on global or joint information of all agents suffer from the curse of dimensionality; 2) the dimension of the information each UAV receives is variable, which is incompatible with neural networks that have fixed input dimensions; 3) the UAVs are homogeneous and interchangeable, so each UAV's policy should be independent of the permutation of its received information. To this end, we propose a DRL method for UAV swarms to solve the MTT problem. Firstly, a decentralized swarm-oriented Markov Decision Process (MDP) model is presented for UAV swarms, which involves each UAV's local communication and partial observation. Secondly, to achieve better scalability, a cartogram feature representation (FR) is proposed to integrate the variable-dimensional information set into a fixed-shape input variable; the cartogram FR also maintains permutation irrelevance with respect to the information. Then, the double deep Q-learning network with dueling architecture is adapted to the MTT problem, and an experience-sharing training mechanism is adopted to learn a shared cooperative policy for UAV swarms. Extensive experiments are provided, and the results show that our method can successfully learn a cooperative tracking policy for UAV swarms and outperforms the baseline method in tracking ratio and scalability. (c) 2021 Elsevier B.V. All rights reserved.
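
The abstract's second contribution, turning a variable-size set of received information into a fixed-shape, permutation-independent input, can be illustrated with a simple grid-binning sketch. This is not the paper's cartogram FR, whose details are not given in this record; it is a generic occupancy-count grid under assumed conventions. The function name `grid_feature`, the square sensing region, and the parameters `sensing_range` and `bins` are all hypothetical.

```python
def grid_feature(rel_positions, sensing_range=10.0, bins=4):
    """Bin relative (dx, dy) positions into a bins x bins count grid.

    Hypothetical illustration only: the output shape depends on `bins`,
    not on how many positions are received, and the counts do not
    depend on the order in which positions arrive.
    """
    grid = [[0] * bins for _ in range(bins)]
    cell = 2.0 * sensing_range / bins  # side length of one grid cell
    for dx, dy in rel_positions:
        if abs(dx) >= sensing_range or abs(dy) >= sensing_range:
            continue  # ignore anything outside the assumed sensing square
        # Shift coordinates to [0, 2 * sensing_range) and find the cell index;
        # the clamp guards against floating-point rounding at the upper edge.
        col = min(bins - 1, int((dx + sensing_range) // cell))
        row = min(bins - 1, int((dy + sensing_range) // cell))
        grid[row][col] += 1
    return grid


# Three neighbours in range and one out of range (assumed example data).
neighbours = [(1.0, 1.0), (-3.0, 2.0), (7.0, -8.0), (12.0, 0.0)]
feature = grid_feature(neighbours)
same_feature = grid_feature(list(reversed(neighbours)))
```

Here `feature` and `same_feature` are identical even though the input order differs, and an empty neighbour list yields a grid of the same shape, which is the property a fixed-input network needs.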


Pages: 285-297

Page count: 13
