TCTrack: Temporal Contexts for Aerial Tracking

Cited by: 153
Authors
Cao, Ziang [1]
Huang, Ziyuan [2]
Pan, Liang [3]
Zhang, Shiwei [4]
Liu, Ziwei [3]
Fu, Changhong [1]
Affiliations
[1] Tongji Univ, Shanghai, Peoples R China
[2] Natl Univ Singapore, Singapore, Singapore
[3] Nanyang Technol Univ, S Lab, Singapore, Singapore
[4] Alibaba Grp, DAMO Acad, Hangzhou, Peoples R China
Source
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022
Funding
Natural Science Foundation of Shanghai; National Natural Science Foundation of China;
Keywords
ROBUST;
DOI
10.1109/CVPR52688.2022.01438
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Temporal contexts among consecutive frames are far from being fully utilized in existing visual trackers. In this work, we present TCTrack, a comprehensive framework to fully exploit temporal contexts for aerial tracking. The temporal contexts are incorporated at two levels: the extraction of features and the refinement of similarity maps. Specifically, for feature extraction, an online temporally adaptive convolution is proposed to enhance the spatial features using temporal information, which is achieved by dynamically calibrating the convolution weights according to the previous frames. For similarity map refinement, we propose an adaptive temporal transformer, which first effectively encodes temporal knowledge in a memory-efficient way, before the temporal knowledge is decoded for accurate adjustment of the similarity map. TCTrack is effective and efficient: evaluation on four aerial tracking benchmarks shows its impressive performance; real-world UAV tests show its high speed of over 27 FPS on NVIDIA Jetson AGX Xavier.
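As a rough illustration of the phrase "dynamically calibrating the convolution weights according to the previous frames", the following toy sketch scales a fixed 1-D kernel by a factor computed from the features of earlier frames. It is not the paper's implementation: the class name `TemporallyCalibratedConv`, the average-pooled frame descriptor, the `tanh`-based factor, and the `history` parameter are all assumptions made only for this example.

```python
import math
from collections import deque
from typing import Deque, List, Sequence


def conv1d_valid(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Plain 1-D cross-correlation without padding ("valid" mode)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


class TemporallyCalibratedConv:
    """Toy convolution whose weights are rescaled using previous frames.

    Hypothetical design: each past frame is summarised by its mean feature
    value, and the base kernel is multiplied by 1 + tanh(mean of summaries).
    """

    def __init__(self, base_kernel: Sequence[float], history: int = 3) -> None:
        self.base_kernel = list(base_kernel)
        # Only the last `history` frame descriptors are kept (online setting).
        self.descriptors: Deque[float] = deque(maxlen=history)

    def calibration(self) -> float:
        """Return the scaling factor; 1.0 when no previous frame exists."""
        if not self.descriptors:
            return 1.0
        mean_desc = sum(self.descriptors) / len(self.descriptors)
        return 1.0 + math.tanh(mean_desc)

    def __call__(self, frame_feature: Sequence[float]) -> List[float]:
        # Calibrate weights from PREVIOUS frames only, then convolve.
        factor = self.calibration()
        kernel = [w * factor for w in self.base_kernel]
        output = conv1d_valid(frame_feature, kernel)
        # Store this frame's descriptor for use by later frames.
        self.descriptors.append(sum(frame_feature) / len(frame_feature))
        return output


conv = TemporallyCalibratedConv([1.0, -1.0], history=2)
first = conv([0.0, 0.0, 0.0])   # no history yet, factor is 1.0
second = conv([1.0, 3.0, 6.0])  # history holds a zero descriptor, factor 1.0
third = conv([1.0, 3.0, 6.0])   # same input, but the kernel is now rescaled
```

The same input frame yields different outputs on the second and third calls because the kernel depends on what was seen before, which is the only point the sketch is meant to convey.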
Pages: 14778-14788
Number of pages: 11