TrackingMamba: Visual State Space Model for Object Tracking

Times Cited: 2
Authors
Wang, Qingwang [1 ,2 ]
Zhou, Liyao [1 ,2 ]
Jin, Pengcheng [1 ,2 ]
Xin, Qu [1 ,2 ]
Zhong, Hangwei [1 ,2 ]
Song, Haochen [1 ,2 ]
Shen, Tao [1 ,2 ]
Affiliations
[1] Kunming Univ Sci & Technol, Fac Informat Engn & Automat, Kunming 650500, Peoples R China
[2] Kunming Univ Sci & Technol, Yunnan Key Lab Comp Technol Applicat, Kunming 650500, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Object tracking; Autonomous aerial vehicles; Transformers; Feature extraction; Computational modeling; Accuracy; Visualization; Jungle scenes; Mamba; object tracking; UAV remote sensing;
DOI
10.1109/JSTARS.2024.3458938
CLC Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Subject Classification Codes
0808; 0809;
Abstract
In recent years, UAV object tracking has provided technical support across various fields. Most existing work relies on convolutional neural networks (CNNs) or visual transformers. However, CNNs have limited receptive fields, resulting in suboptimal performance, while transformers require substantial computational resources, making training and inference challenging. Mountainous and jungle environments, critical components of the Earth's surface and key scenarios for UAV object tracking, present unique challenges due to steep terrain, dense vegetation, and rapidly changing weather conditions, all of which complicate UAV tracking. The lack of relevant datasets further reduces tracking accuracy. This article introduces a new tracking framework based on a state-space model, called TrackingMamba, which uses a single-stream tracking architecture with Vision Mamba as its backbone. TrackingMamba not only matches transformer-based trackers in global feature extraction and long-range dependency modeling but also maintains computational efficiency, with cost growing linearly in sequence length. Compared to other advanced trackers, TrackingMamba delivers higher accuracy with a simpler model framework, fewer parameters, and reduced FLOPs. Specifically, on the UAV123 benchmark, TrackingMamba outperforms the baseline model OSTrack-256, improving AUC by 2.59% and precision by 4.42%, while reducing parameters by 95.52% and FLOPs by 95.02%. The article also evaluates the performance and shortcomings of TrackingMamba and other advanced trackers in the complex and critical context of jungle environments, and it explores potential future research directions in UAV jungle object tracking.
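The linear-cost claim in the abstract comes from the state-space recurrence that Mamba-style backbones are built on: the sequence is processed in a single scan rather than by all-pairs attention. The toy below is a hypothetical scalar illustration of that recurrence (h_t = a·h_{t-1} + b·x_t, y_t = c·h_t), not the TrackingMamba implementation; the parameter names a, b, c are illustrative stand-ins for the model's learned state matrices.

```python
def ssm_scan(xs, a=0.9, b=1.0, c=1.0):
    """Run a scalar state-space recurrence over an input sequence.

    One pass over L inputs costs O(L), in contrast to the O(L^2)
    pairwise interactions of self-attention.
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x   # state update: decay old state, mix in new input
        ys.append(c * h)    # linear readout of the hidden state
    return ys


# An impulse input decays geometrically through the state,
# showing how earlier inputs influence later outputs at constant cost.
print(ssm_scan([1.0, 0.0, 0.0]))  # ~[1.0, 0.9, 0.81]
```

Real Mamba blocks make a, b, and c input-dependent ("selective") matrices and compute the scan in parallel on GPU, but the linear dependence on sequence length is the same property sketched here.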
Pages: 16744-16754
Page count: 11
Related Papers
50 records
  • [31] Parallel Tracker for Visual Object Tracking
    Liang, Xiangluan
    Lai, Ru
    Bi, Luzheng
    PROCEEDINGS OF THE 30TH CHINESE CONTROL AND DECISION CONFERENCE (2018 CCDC), 2018, : 5676 - 5681
  • [32] Visual Object Tracking: The Initialisation Problem
    De Ath, George
    Everson, Richard
    2018 15TH CONFERENCE ON COMPUTER AND ROBOT VISION (CRV), 2018, : 142 - 149
  • [33] Online Multi-Object Tracking With Visual and Radar Features
    Bae, Seung-Hwan
    IEEE ACCESS, 2020, 8 (08): : 90324 - 90339
  • [34] Visual Object Tracking With Mutual Affinity Aligned to Human Intuition
    Zeng, Guotian
    Zeng, Bi
    Wei, Qingmao
    Hu, Huiting
    Zhang, Hong
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 10055 - 10068
  • [35] Manipulating Template Pixels for Model Adaptation of Siamese Visual Tracking
    Li, Zhenbang
    Li, Bing
    Gao, Jin
    Li, Liang
    Hu, Weiming
    IEEE SIGNAL PROCESSING LETTERS, 2020, 27 : 1690 - 1694
  • [36] Building a Robust Appearance Model for Object Tracking
    Yao, Zhijun
    Feng, Bin
    Wang, Junwei
    Liu, Wenyu
    2009 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND COMPUTATIONAL INTELLIGENCE, VOL III, PROCEEDINGS, 2009, : 471 - 475
  • [37] Application of Binary Tree Model in Object Tracking
    Zheng Y.
    Li R.
    Huanan Ligong Daxue Xuebao/Journal of South China University of Technology (Natural Science), 2020, 48 (01): : 42 - 50
  • [38] Satellite Video Object Tracking Based on Location Prompts
    Wang, Jiahao
    Liu, Fang
    Jiao, Licheng
    Gao, Yingjia
    Wang, Hao
    Li, Lingling
    Chen, Puhua
    Liu, Xu
    Li, Shuo
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (07) : 6253 - 6264
  • [39] A four dukkha state-space model for hand tracking
    Lim, Kian Ming
    Tan, Alan W. C.
    Tan, Shing Chiang
    NEUROCOMPUTING, 2017, 267 : 311 - 319
  • [40] Variable scale learning for visual object tracking
    He, Xuedong
    Zhao, Lu
    Chen, Calvin Yu-Chian
    Journal of Ambient Intelligence and Humanized Computing, 2023, 14 : 3315 - 3330