Multi-target tracking for unmanned aerial vehicle swarms using deep reinforcement learning

Cited by: 47
Authors
Zhou, Wenhong [1]
Liu, Zhihong [1]
Li, Jie [1]
Xu, Xin [1]
Shen, Lincheng [1]
Affiliations
[1] National University of Defense Technology, College of Intelligence Science and Technology, Changsha 410073, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
UAV swarms; Multi-target tracking; Multi-agent reinforcement learning; Scalability; Feature representation; ROBOTS; ALGORITHMS; SEARCH
DOI
10.1016/j.neucom.2021.09.044
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, deep reinforcement learning (DRL) has proved its great potential in multi-agent cooperation. However, how to apply DRL to multi-target tracking (MTT) problem for unmanned aerial vehicle (UAV) swarms is challenging: 1) the scale of UAVs may be large, but the existing multi-agent reinforcement learning (MARL) methods that rely on global or joint information of all agents suffer from the dimensionality curse; 2) the dimension of each UAV's received information is variable, which is incompatible with the neural networks with fixed input dimensions; 3) the UAVs are homogeneous and interchangeable that each UAV's policy should be irrelevant to the permutation of its received information. To this end, we propose a DRL method for UAV swarms to solve the MTT problem. Firstly, a decentralized swarm-oriented Markov Decision Process (MDP) model is presented for UAV swarms, which involves each UAV's local communication and partial observation. Secondly, to achieve better scalability, a cartogram feature representation (FR) is proposed to integrate the variable-dimensional information set into a fixed-shape input variable, and the cartogram FR can also maintain the permutation irrelevance to the information. Then, the double deep Q-learning network with dueling architecture is adapted to the MTT problem, and the experience-sharing training mechanism is adopted to learn the shared cooperative policy for UAV swarms. Extensive experiments are provided and the results show that our method can successfully learn a cooperative tracking policy for UAV swarms and outperforms the baseline method in the tracking ratio and scalability. © 2021 Elsevier B.V. All rights reserved.
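The abstract names three ingredients without giving their definitions: a fixed-shape, permutation-irrelevant input, a dueling architecture, and double Q-learning. The sketch below illustrates each under stated assumptions and is not the authors' implementation. In particular, `grid_encode` is a hypothetical stand-in that simply bins relative positions into a count grid; the paper's cartogram FR may be constructed differently. `dueling_q` and `double_dqn_target` follow the standard textbook forms of those two techniques. All function and parameter names are illustrative.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def grid_encode(own: Point, others: Sequence[Point],
                half_range: float, cells: int) -> List[List[int]]:
    """Bin relative positions into a cells x cells count grid centred on `own`.

    ASSUMPTION: a simple occupancy-count grid, used only to show how a
    variable-size, unordered set can become a fixed-shape input. It is not
    the cartogram FR defined in the paper.
    """
    grid = [[0] * cells for _ in range(cells)]
    cell_size = 2.0 * half_range / cells
    for x, y in others:
        dx, dy = x - own[0], y - own[1]
        # Ignore anything outside the square window around this UAV.
        if not (-half_range <= dx < half_range and -half_range <= dy < half_range):
            continue
        # Clamp to guard against floating-point rounding at the upper edge.
        col = min(cells - 1, int((dx + half_range) // cell_size))
        row = min(cells - 1, int((dy + half_range) // cell_size))
        grid[row][col] += 1  # counting makes the result order-independent
    return grid


def dueling_q(value: float, advantages: Sequence[float]) -> List[float]:
    """Standard dueling aggregation: Q(a) = V + A(a) - mean(A)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_dqn_target(reward: float, gamma: float,
                      next_q_online: Sequence[float],
                      next_q_target: Sequence[float],
                      done: bool) -> float:
    """Standard double Q-learning target.

    The online network selects the next action; the target network
    evaluates it.
    """
    if done:
        return reward
    best = max(range(len(next_q_online)), key=lambda i: next_q_online[i])
    return reward + gamma * next_q_target[best]


if __name__ == "__main__":
    neighbours = [(1.0, 1.0), (-1.0, -1.0), (5.0, 5.0)]
    a = grid_encode((0.0, 0.0), neighbours, half_range=2.0, cells=4)
    b = grid_encode((0.0, 0.0), list(reversed(neighbours)), half_range=2.0, cells=4)
    print(a == b)  # same grid regardless of input order
    print(dueling_q(1.0, [1.0, 2.0, 3.0]))
    print(double_dqn_target(1.0, 0.5, [0.1, 0.9], [4.0, 2.0], done=False))
```

Because the grid only counts entries per cell, its shape depends on `cells` alone and not on how many neighbours or targets are received, which is the property the abstract attributes to the cartogram FR.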
Pages: 285-297
Page count: 13