Multi-target tracking for unmanned aerial vehicle swarms using deep reinforcement learning

Cited by: 50
Authors
Zhou, Wenhong [1 ]
Liu, Zhihong [1 ]
Li, Jie [1 ]
Xu, Xin [1 ]
Shen, Lincheng [1 ]
Affiliation
[1] Natl Univ Def Technol, Coll Intelligence Sci & Technol, Changsha 410073, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
UAV swarms; Multi-target tracking; Multi-agent reinforcement learning; Scalability; Feature representation; ROBOTS; ALGORITHMS; SEARCH;
DOI
10.1016/j.neucom.2021.09.044
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, deep reinforcement learning (DRL) has proved its great potential in multi-agent cooperation. However, applying DRL to the multi-target tracking (MTT) problem for unmanned aerial vehicle (UAV) swarms is challenging: 1) the number of UAVs may be large, but existing multi-agent reinforcement learning (MARL) methods that rely on global or joint information of all agents suffer from the curse of dimensionality; 2) the dimension of each UAV's received information is variable, which is incompatible with neural networks that have fixed input dimensions; 3) the UAVs are homogeneous and interchangeable, so each UAV's policy should be invariant to the permutation of its received information. To this end, we propose a DRL method for UAV swarms to solve the MTT problem. First, a decentralized swarm-oriented Markov Decision Process (MDP) model is presented for UAV swarms, which involves each UAV's local communication and partial observation. Second, to achieve better scalability, a cartogram feature representation (FR) is proposed to integrate the variable-dimensional information set into a fixed-shape input variable; the cartogram FR also remains invariant to the permutation of the information. Then, the double deep Q-learning network with dueling architecture is adapted to the MTT problem, and an experience-sharing training mechanism is adopted to learn a shared cooperative policy for UAV swarms. Extensive experiments show that our method successfully learns a cooperative tracking policy for UAV swarms and outperforms the baseline method in tracking ratio and scalability. (c) 2021 Elsevier B.V. All rights reserved.
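The core scalability idea in the abstract is that a variable-sized, unordered set of observations can be mapped into a fixed-shape, permutation-invariant input for a Q-network. A minimal sketch of that idea, assuming a simple spatial-histogram construction (the paper's exact cartogram layout, and the function name and parameters below, are illustrative assumptions):

```python
import numpy as np

def cartogram_fr(positions, sensing_range=10.0, grid=8):
    """Map a variable-sized set of relative positions (e.g. observed
    neighbors or targets) into a fixed grid x grid array of counts.

    Because the output is a histogram over space, its shape is constant
    no matter how many items are observed, and it does not change when
    the input set is reordered -- the two properties the abstract
    requires of the cartogram feature representation."""
    img = np.zeros((grid, grid), dtype=np.float32)
    for x, y in positions:
        # discretize each relative position into a grid cell
        i = int((x + sensing_range) / (2 * sensing_range) * grid)
        j = int((y + sensing_range) / (2 * sensing_range) * grid)
        if 0 <= i < grid and 0 <= j < grid:
            img[i, j] += 1.0
    return img

# Two orderings of the same observation set give identical features,
# and any number of observed items yields the same 8x8 shape.
a = cartogram_fr([(1.0, 2.0), (-3.0, 4.0), (0.5, -0.5)])
b = cartogram_fr([(0.5, -0.5), (1.0, 2.0), (-3.0, 4.0)])
assert a.shape == (8, 8) and np.array_equal(a, b)
```

A fixed 8x8 array like this can then feed a standard dueling double-DQN with fixed input dimensions, sidestepping the variable-dimension and permutation issues listed as challenges 2) and 3).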
Pages: 285-297
Page count: 13
Related papers
50 records in total
[31]   COLREG-Compliant Collision Avoidance for Unmanned Surface Vehicle Using Deep Reinforcement Learning [J].
Meyer, Eivind ;
Heiberg, Amalie ;
Rasheed, Adil ;
San, Omer .
IEEE ACCESS, 2020, 8 :165344-165364
[32]   Learning Optimal Parameters for Multi-target Tracking with Contextual Interactions [J].
Wang, Shaofei ;
Fowlkes, Charless C. .
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2017, 122 (03) :484-501
[33]   Multi-target Tracking using Mixed Spatio-Temporal Features Learning Model [J].
Ge Yinghui ;
Yu Jianjun .
2009 IEEE INTERNATIONAL CONFERENCE ON AUTOMATION AND LOGISTICS ( ICAL 2009), VOLS 1-3, 2009, :799-+
[34]   IMPROVED DATA ASSOCIATION ALGORITHM FOR AIRBORNE RADAR MULTI-TARGET TRACKING VIA DEEP LEARNING NETWORK [J].
Li, Wenna ;
Yang, Ailing ;
Zhang, Lianzhong .
2022 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2022), 2022, :7417-7420
[35]   Deep Reinforcement Learning Applied to a Spherical Robot for Target Tracking [J].
Escorza, Omar ;
Garcia, Gonzalo ;
Fabregas, Ernesto ;
Velastin, Sergio A. ;
Eskandarian, Azim ;
Farias, Gonzalo .
IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2025,
[36]   Traffic Flow Video Image Recognition and Analysis Based on Multi-Target Tracking Algorithm and Deep Learning [J].
Zou, Songshang ;
Chen, Hao ;
Feng, Hui ;
Xiao, Guangyi ;
Qin, Zhen ;
Cai, Weiwei .
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (08) :8762-8775
[37]   Three-dimensional vehicle multi-target tracking based on trajectory optimization [J].
Cai, Hua ;
Kou, Ting-Ting ;
Yang, Yi-Ning ;
Ma, Zhi-Yong ;
Wang, Wei-Gang ;
Sun, Jun-Xi .
JOURNAL OF JILIN UNIVERSITY (ENGINEERING AND TECHNOLOGY EDITION), 2024, 54 (08) :2338-2347
[38]   Collaborative Integrated Navigation for Unmanned Aerial Vehicle Swarms Under Multiple Uncertainties [J].
Zhang, Le ;
Cao, Xiaomeng ;
Su, Mudan ;
Sui, Yeye .
SENSORS, 2025, 25 (03)
[39]   An improved method for multi-target tracking [J].
Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, China .
INFORMATION TECHNOLOGY JOURNAL, 2007 :725-732
[40]   Q-learning-based routing inspired by adaptive flocking control for collaborative unmanned aerial vehicle swarms [J].
Alam, Muhammad Morshed ;
Moh, Sangman .
VEHICULAR COMMUNICATIONS, 2023, 40