Adaptive Multi-Scale Transformer Tracker for Satellite Videos

Cited by: 0
Authors
Zhang, Xin [1]
Jiao, Licheng [1]
Li, Lingling [1]
Liu, Xu [1]
Liu, Fang [1]
Ma, Wenping [1]
Yang, Shuyuan [1]
Affiliations
[1] Xidian Univ, Int Res Ctr Intelligent Percept & Computat, Sch Artificial Intelligence, Minist Educ China, Key Lab Intelligent Percept & Image Understanding, Xian 710071, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2024 / Vol. 62
Funding
National Natural Science Foundation of China;
Keywords
Feature extraction; Transformers; Satellites; Target tracking; Videos; Video tracking; Computational modeling; Adaptive Transformer; multi-scale Transformer (MT); object regression; satellite video tracking; OBJECT TRACKING;
DOI
10.1109/TGRS.2024.3441038
Chinese Library Classification
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708 ; 070902 ;
Abstract
Satellite video tracking is often characterized by blurred foreground boundaries in vast scenes, targets that vary widely in scale, and irregular changes in appearance. These challenges make it difficult to optimize a robust tracker. It is therefore imperative to extract diverse features with dynamic, adaptive learning capabilities for the target tracked in each sequence. In this article, we propose a novel adaptive multi-scale Transformer (MT) tracker for satellite videos that effectively exploits the potential spatiotemporal information of the target. Specifically, a multi-scale spatial Transformer (MSST) is designed to leverage stage-by-stage spatial reduction and channel doubling, thereby enhancing the representation capabilities for the tracked target. For dynamic feature learning, an adaptive temporal Transformer (ATT) based on multiple cross attentions is then introduced to provide adaptive learning capacity for the dynamic target. It automatically determines the weight proportion of the different attentions in a specific sequence through learnable parameters. Finally, a multi-scale feature (MSF) regression module is crafted to improve the positioning accuracy of targets with low pixel counts in satellite scenes. This module achieves precise prediction of target boxes by effectively fusing features from diverse stages. We evaluate the proposed tracker on several public satellite datasets, including SatSOT, SV248S, and VISO. Experimental results show that the performance of our model is comparable to that of state-of-the-art trackers.
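The two mechanisms named in the abstract, stage-by-stage spatial reduction with channel doubling and learnable weighting of several attention outputs, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The halving factor per stage, the softmax normalization of the weights, and the function names `stage_shapes`, `softmax`, and `fuse_attention_outputs` are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def stage_shapes(height, width, channels, num_stages):
    """Return (H, W, C) for each stage.

    Assumption: each stage halves the spatial size and doubles the channels.
    The abstract only says "spatial reduction and channel doubling".
    """
    shapes = []
    for _ in range(num_stages):
        shapes.append((height, width, channels))
        height, width, channels = height // 2, width // 2, channels * 2
    return shapes


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_attention_outputs(outputs, logits):
    """Weighted sum of several attention outputs (equal-length vectors).

    Assumption: the learnable parameters are logits turned into weights by
    a softmax; in a real model they would be updated by gradient descent.
    """
    weights = softmax(logits)
    dim = len(outputs[0])
    return [sum(w * out[i] for w, out in zip(weights, outputs))
            for i in range(dim)]


shapes = stage_shapes(64, 64, 32, 3)
fused = fuse_attention_outputs([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

With equal logits the two attention outputs contribute equally; raising one logit shifts the fused feature toward that attention branch, which is the kind of per-sequence reweighting the abstract describes.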
Pages: 16