Adaptive Multi-Scale Transformer Tracker for Satellite Videos

Cited by: 0

Authors:

Zhang, Xin [1]

Jiao, Licheng [1]

Li, Lingling [1]

Liu, Xu [1]

Liu, Fang [1]

Ma, Wenping [1]

Yang, Shuyuan [1]

Affiliations:

[1] Xidian Univ, Int Res Ctr Intelligent Percept & Computat, Sch Artificial Intelligence, Minist Educ China, Key Lab Intelligent Percept & Image Understanding, Xian 710071, Peoples R China

Source:

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2024 / Vol. 62

Funding:

National Natural Science Foundation of China

Keywords:

Feature extraction; Transformers; Satellites; Target tracking; Videos; Video tracking; Computational modeling; Adaptive Transformer; multi-scale Transformer (MT); object regression; satellite video tracking; OBJECT TRACKING

DOI:

10.1109/TGRS.2024.3441038

Chinese Library Classification (CLC):

P3 [Geophysics]; P59 [Geochemistry]

Discipline classification codes:

0708; 070902

Abstract:

Satellite video tracking is often characterized by blurred foreground boundaries in vast scenes, targets that vary widely in scale, and irregular changes in appearance. These challenges make it difficult to optimize a robust tracker. It is therefore important to extract diverse features, with dynamic adaptive learning capabilities, for the target tracked in each sequence. In this article, we propose an adaptive multi-scale Transformer (MT) tracker for satellite videos that exploits the potential spatiotemporal information of the target. Specifically, a multi-scale spatial Transformer (MSST) applies stage-by-stage spatial reduction and channel doubling to enhance the representation of the tracked target. For dynamic feature learning, an adaptive temporal Transformer (ATT) built on multiple cross-attentions is then introduced; through learnable parameters, it automatically determines the weight of each attention for the specific sequence. Finally, a multi-scale feature (MSF) regression module is designed to improve the localization accuracy of targets with low pixel counts in satellite scenes. This module localizes target boxes precisely by fusing features from different stages. We evaluate the proposed tracker on several public satellite datasets, including SatSOT, SV248S, and VISO. Experimental results show that our model performs comparably to state-of-the-art trackers.
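
The two mechanisms named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract gives no reduction factor, stage count, or normalization scheme, so the halving factor of 2, the softmax over learnable scalars, and the function names `stage_shapes` and `fuse_attention_outputs` are all assumptions made for illustration.

```python
import math


def stage_shapes(height, width, channels, num_stages):
    """Return the (H, W, C) feature shape at each stage.

    Assumption: each stage halves the spatial size and doubles the channels.
    The abstract only says "spatial reduction and channel doubling".
    """
    shapes = []
    h, w, c = height, width, channels
    for _ in range(num_stages):
        shapes.append((h, w, c))
        h, w, c = max(1, h // 2), max(1, w // 2), c * 2
    return shapes


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_attention_outputs(outputs, logits):
    """Combine several attention outputs with learnable weights.

    Assumption: the learnable parameters are one scalar per attention branch,
    normalized with softmax. `outputs` is a list of equal-length vectors.
    """
    weights = softmax(logits)
    dim = len(outputs[0])
    return [
        sum(wgt * out[i] for wgt, out in zip(weights, outputs))
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Hypothetical input size and stage count, chosen only for the demo.
    for stage, shape in enumerate(stage_shapes(64, 64, 32, 3), start=1):
        print("stage", stage, shape)
    print(fuse_attention_outputs([[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0]))
```

In a trained model the scalars passed as `logits` would be updated by gradient descent per sequence, which is one plausible reading of "analyzes the weight proportion of different attentions automatically".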


Pages: 16

