Adaptive Multi-Scale Transformer Tracker for Satellite Videos

Cited by: 0
Authors
Zhang, Xin [1]
Jiao, Licheng [1]
Li, Lingling [1]
Liu, Xu [1]
Liu, Fang [1]
Ma, Wenping [1]
Yang, Shuyuan [1]
Affiliations
[1] Xidian Univ, Int Res Ctr Intelligent Percept & Computat, Sch Artificial Intelligence, Minist Educ China, Key Lab Intelligent Percept & Image Understanding, Xian 710071, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2024 / Vol. 62
Funding
National Natural Science Foundation of China
Keywords
Feature extraction; Transformers; Satellites; Target tracking; Videos; Video tracking; Computational modeling; Adaptive Transformer; multi-scale Transformer (MT); object regression; satellite video tracking; OBJECT TRACKING;
DOI
10.1109/TGRS.2024.3441038
Chinese Library Classification
P3 [Geophysics]; P59 [Geochemistry]
Subject classification codes
0708; 070902
Abstract
Satellite video tracking is often characterized by blurred foreground boundaries in vast scenes, targets that vary widely in scale, and irregular changes in appearance. These challenges make it significantly harder to optimize a robust tracker. It is therefore imperative to extract diverse features with dynamic adaptive learning capabilities for the target being tracked in each sequence. In this article, we propose a novel adaptive multi-scale Transformer (MT) tracker for satellite videos that effectively exploits the potential spatiotemporal information of the target. Specifically, a multi-scale spatial Transformer (MSST) is designed to leverage stage-by-stage spatial reduction and channel doubling, thereby enhancing the representation capabilities for the tracked target. For dynamic feature learning, an adaptive temporal Transformer (ATT) based on multiple cross attentions is then introduced to provide adaptive learning capacity for the dynamic target. It automatically determines the weight proportion of the different attentions for the specific sequence through learnable parameters. Finally, a multi-scale feature (MSF) regression module is crafted to improve the positioning accuracy of targets with low pixel counts in satellite scenes. This module achieves precise localization of target boxes by effectively fusing features from diverse stages. We evaluate the proposed tracker on several public satellite datasets, including SatSOT, SV248S, and VISO. Experimental results show that the performance of our model is comparable to that of state-of-the-art trackers.
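Two ideas in the abstract can be illustrated with a small sketch: the stage-by-stage spatial reduction with channel doubling, and the learnable weighting of several attention outputs. The code below is not the authors' implementation. The function names, the reduction factor of 2, the number of stages, and the use of a softmax to turn learnable parameters into weights are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def stage_shapes(height, width, channels, num_stages, reduction=2):
    """List (height, width, channels) per stage.

    Assumption: each stage divides the spatial size by `reduction`
    and doubles the channel count.
    """
    shapes = []
    for _ in range(num_stages):
        shapes.append((height, width, channels))
        height, width = height // reduction, width // reduction
        channels *= 2
    return shapes


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_attention_outputs(outputs, logits):
    """Weighted sum of several attention outputs (equal-length vectors).

    `logits` stands in for the learnable parameters; normalizing them
    with a softmax is an assumption, not a detail taken from the paper.
    """
    weights = softmax(logits)
    dim = len(outputs[0])
    return [sum(w * out[i] for w, out in zip(weights, outputs)) for i in range(dim)]


# Hypothetical sizes: a 64x64 feature map with 32 channels, three stages.
shapes = stage_shapes(64, 64, 32, num_stages=3)

# Two toy attention outputs with equal logits receive equal weight.
fused = fuse_attention_outputs([[1.0, 0.0], [0.0, 1.0]], logits=[0.0, 0.0])
```

In a trained model the logits would be updated by gradient descent, so the mixture of attention branches could differ from one sequence to another, which is the behavior the abstract attributes to the ATT.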
Pages: 16