UAV-Ground Visual Tracking: A Unified Dataset and Collaborative Learning Approach

Cited by: 6
Authors
Sun, Dengdi [1 ,2 ,3 ]
Cheng, Leilei [4 ]
Chen, Song [4 ]
Li, Chenglong [1 ,2 ]
Xiao, Yun [1 ,2 ]
Luo, Bin [4 ]
Affiliations
[1] Anhui Univ, Minist Educ, Key Lab Intelligent Comp & Signal Proc, Hefei 230601, Peoples R China
[2] Anhui Univ, Sch Artificial Intelligence, Hefei 230601, Peoples R China
[3] Hefei Comprehens Natl Sci Ctr, Inst Artificial Intelligence, Hefei 230026, Peoples R China
[4] Anhui Univ, Sch Comp Sci & Technol, Anhui Prov Key Lab Multimodal Cognit Computat, Hefei 230601, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Target tracking; Visualization; Autonomous aerial vehicles; Object tracking; Task analysis; Fuses; Video sequences; Visual tracking; transformer; UAV and ground views; benchmark dataset; collaborative learning; OBJECT TRACKING; FUSION; ROBUST;
DOI
10.1109/TCSVT.2023.3316990
CLC Classification Number
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Subject Classification Codes
0808; 0809;
Abstract
Visual tracking from the ground view and from the UAV view has received increasing attention due to its wide range of practical applications. The two views offer strongly complementary descriptions of the target object, such as detailed appearance in the ground view and global motion information in the UAV view, and combining them has the potential to make the tracking system more robust. However, no prior work has studied this problem in depth, and accurately combining ground-view and UAV-view information is challenging. To fill this gap and address the challenge, we propose a new computer vision task called UAV-Ground visual tracking. Given the lack of relevant data and methods, we first present a unified video dataset called UGVT, which contains 210 pairs of high-resolution UAV and ground video sequences totaling more than 204K frames and serves as a comprehensive evaluation platform for relevant tracking methods. Second, building on the newly constructed dataset, we propose a collaborative learning method called MvCL to fuse information from the ground and UAV views. It first associates the same tracking target across the two views via a cross-attention operation and then fuses the complementary information of the two views. Notably, as a plug-and-play module based on the Transformer structure, the method can be flexibly embedded into different tracking frameworks. Extensive experiments on the newly created dataset demonstrate that the proposed method improves tracking robustness compared with 10 state-of-the-art tracking methods, and they also indicate the promise and significance of further UAV-Ground visual tracking research. The dataset is available at: https://github.com/mmic-lcl/Datasets-and-benchmark-code/.
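The abstract only outlines MvCL at a high level. Below is a minimal PyTorch sketch of the described idea, cross-attention to associate the same target across the two views followed by fusion of the complementary features, not the authors' implementation: the class name CrossViewFusion, the embedding dimension, the head count, and the linear fusion head are all illustrative assumptions, and both views are assumed to yield equal-length token sequences.

import torch
import torch.nn as nn

class CrossViewFusion(nn.Module):
    """Hypothetical plug-and-play module: cross-attention associates the
    target across ground and UAV views, then the enhanced features are fused."""

    def __init__(self, embed_dim: int = 256, num_heads: int = 8):
        super().__init__()
        # Ground tokens attend to UAV tokens, and vice versa.
        self.ground_to_uav = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.uav_to_ground = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm_g = nn.LayerNorm(embed_dim)
        self.norm_u = nn.LayerNorm(embed_dim)
        # Simple fusion head: concatenate per-token features, project back.
        self.fuse = nn.Linear(2 * embed_dim, embed_dim)

    def forward(self, ground_tokens: torch.Tensor, uav_tokens: torch.Tensor) -> torch.Tensor:
        # Both inputs: (batch, num_tokens, embed_dim), assumed equal num_tokens.
        g_enh, _ = self.ground_to_uav(ground_tokens, uav_tokens, uav_tokens)
        u_enh, _ = self.uav_to_ground(uav_tokens, ground_tokens, ground_tokens)
        g = self.norm_g(ground_tokens + g_enh)  # residual + norm, standard Transformer style
        u = self.norm_u(uav_tokens + u_enh)
        return self.fuse(torch.cat([g, u], dim=-1))

# Usage sketch: such a module would sit between a tracker's backbone and its head.
fusion = CrossViewFusion(embed_dim=256, num_heads=8)
ground = torch.randn(2, 64, 256)  # dummy ground-view feature tokens
uav = torch.randn(2, 64, 256)     # dummy UAV-view feature tokens
fused = fusion(ground, uav)       # (2, 64, 256)

Because the module consumes and produces token sequences of the backbone's embedding dimension, it can in principle be inserted into different Transformer-based tracking frameworks, which matches the plug-and-play property the abstract claims for MvCL.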
Pages: 3619-3632
Page count: 14