Multi-modal multi-task feature fusion for RGBT tracking

Cited by: 25
Authors
Cai, Yujue [1]
Sui, Xiubao [1]
Gu, Guohua [1]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Elect & Opt Engn, Nanjing 210014, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
RGBT tracking; Auxiliary learning; Contrastive learning; Semantic matching; Instance segmentation; NETWORK;
DOI
10.1016/j.inffus.2023.101816
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
RGBT tracking has received increasing attention in recent years. In this paper, we propose a multi-task auxiliary learning framework for RGBT tracking. Specifically, we simplify the tracking task to an instance classification task and make it the primary task of the framework. We design three auxiliary tasks and train all tasks jointly with hard parameter sharing, with the aim that the primary task benefits from them. The three auxiliary tasks are contrastive instance discrimination, one-shot instance segmentation, and instance semantic matching. Contrastive instance discrimination aids the classification process of the primary task by constraining the features in the representation space. One-shot instance segmentation trains the network in a weakly supervised way to focus on more fine-grained features. In addition, to make the network pay more attention to the invariant features of the target instance during tracking, we introduce a semantic matching task that alleviates the model drift caused by changes over time. On three RGBT tracking benchmarks, the proposed framework performs comparably to state-of-the-art trackers.
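The abstract does not give the loss functions, so the following is only a minimal sketch of how a contrastive instance discrimination term and a combined multi-task objective could look. It assumes an InfoNCE-style contrastive loss over cosine similarities and a plain weighted sum of task losses; the function names `info_nce` and `multitask_loss`, the `temperature` value, and the weights are hypothetical and are not taken from the paper.

```python
import math


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor embedding (assumed form, not the paper's).

    All embeddings are non-zero lists of floats with the same length.
    """
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm

    # The positive pair comes first, followed by all negative pairs.
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]

    # Negative log-softmax of the positive logit, computed stably.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[0]


def multitask_loss(primary, auxiliary, weights):
    """Primary loss plus a weighted sum of auxiliary losses (hypothetical weighting)."""
    return primary + sum(w * l for w, l in zip(weights, auxiliary))
```

In a hard parameter sharing setup, all of these losses would be computed from features of one shared backbone, so a single backward pass through the summed objective updates the shared parameters for every task.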
Pages: 17