TLSH-MOT: Drone-View Video Multiple Object Tracking via Transformer-Based Locally Sensitive Hash
Cited by: 0
Authors: Yuan, Yubin [1]; Wu, Yiquan [1]; Zhao, Langyue [1]; Liu, Yuqi [1]; Pang, Yaxuan [1]
Affiliation:
[1] Nanjing Univ Aeronaut & Astronaut, Coll Elect & Informat Engn, Nanjing 211106, Peoples R China
Source: IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2025 / Vol. 63
Funding: National Natural Science Foundation of China;
Keywords:
Remote sensing;
Transformers;
Feature extraction;
Object tracking;
Accuracy;
Trajectory;
Sensors;
Computer vision;
Video tracking;
Surveillance;
Local sensitive hash (LSH);
multiple object tracking (MOT);
spatiotemporal memory (STM);
Transformer;
DOI: 10.1109/TGRS.2025.3545081
CLC classification: P3 [Geophysics]; P59 [Geochemistry];
Discipline codes: 0708; 070902;
Abstract:
Multiple object tracking (MOT) plays an essential role in drone-view remote sensing applications such as urban management, emergency rescue, and maritime monitoring. However, due to large variations in object scale and position, frequent feature loss across frames, and difficulties in matching, traditional methods struggle to achieve high tracking accuracy in such challenging environments. To address these issues, we propose a Transformer-based locally sensitive hash MOT (TLSH-MOT) method for drone-view remote sensing scenarios. First, a frame-level feature extraction and enhancement module is introduced, integrating a nominee proposal generation (NPG) unit and a tilt convolutional vision Transformer (ViT), which enables adaptive detection of objects across varying scales and perspectives. Next, a spatiotemporal memory (STM) structure is designed to mitigate instantaneous environmental interference and periodic changes using short-term and long-term memory blocks, thereby enhancing tracking stability under complex meteorological conditions. In addition, a temporal enhancement feature decoder (TEFD) fuses multisource feature information to better capture the motion patterns of remote sensing objects. Finally, a local sensitive hash (LSH) IDLinker ensures efficient feature matching, significantly improving trajectory association in large-scale monitoring scenarios. Experimental results show that TLSH-MOT achieves MOT accuracy of 40.7% and 62.2% on the VisDrone and UAVDT datasets, respectively, which verifies the superiority of TLSH-MOT in the remote sensing video tracking field. The framework's code is released at: https://github.com/YubinYuan/TLSH-MOT.
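The abstract's LSH IDLinker relies on the core idea of locality-sensitive hashing: nearby feature vectors hash into the same bucket, so trajectory-to-detection matching only compares candidates within a bucket instead of all pairs. The sketch below illustrates this generic idea with a random-hyperplane (SimHash-style) scheme; the function names, dimensions, and bucketing strategy are illustrative assumptions, not the paper's actual IDLinker implementation.

```python
import numpy as np

def hash_codes(features: np.ndarray, planes: np.ndarray) -> np.ndarray:
    """Project feature vectors onto random hyperplanes; keep the sign bits.

    Two vectors with a small angle between them are likely to land on the
    same side of most hyperplanes, hence share most bits of their code.
    """
    return (features @ planes.T > 0).astype(np.uint8)

def match_candidates(query_codes: np.ndarray, gallery_codes: np.ndarray):
    """For each query code, return gallery indices in the identical bucket."""
    return [
        np.where((gallery_codes == q).all(axis=1))[0].tolist()
        for q in query_codes
    ]

rng = np.random.default_rng(0)
planes = rng.normal(size=(8, 4))       # 8 random hyperplanes over 4-D features
gallery = rng.normal(size=(100, 4))    # stored trajectory embeddings
g_codes = hash_codes(gallery, planes)

# An identical embedding always hashes to the same bucket as its source,
# so the candidate set contains the true match without scanning all pairs.
cands = match_candidates(g_codes[:3], g_codes)
```

Slightly perturbed embeddings (e.g. the same object re-detected in the next frame) tend to share a bucket as well, which is what makes LSH useful for fast trajectory association at the scale the abstract describes.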
Pages: 16