TM2B: Transformer-Based Motion-to-Box Network for 3D Single Object Tracking on Point Clouds

Cited by: 0

Authors:

Xu, Anqi [1]

Nie, Jiahao [1]

He, Zhiwei [1]

Lv, Xudong [1]

Affiliations:

[1] Sch Hangzhou Dianzi Univ, Hangzhou 310018, Peoples R China

Source:

IEEE ROBOTICS AND AUTOMATION LETTERS | 2024 / Vol. 9 / Issue 08

Keywords:

Transformers; Accuracy; Three-dimensional displays; Target tracking; Object tracking; Feature extraction; Point cloud compression; 3D single object tracking; motion-to-box; transformer;

DOI:

10.1109/LRA.2024.3418274

Chinese Library Classification (CLC) number:

TP24 [Robotics];

Subject classification codes:

080202 ; 1405 ;

Abstract:

3D single object tracking plays a crucial role in numerous applications such as autonomous driving. Recent trackers based on the motion-centric paradigm perform well because they exploit motion cues to infer the target's relative motion across successive frames, which effectively overcomes the significant appearance variations of targets and distractors caused by occlusion. However, the motion-centric paradigm tends to require a multi-stage motion-to-box process to refine the motion cues, which entails tedious hyper-parameter tuning and elaborate subtask designs. In this letter, we propose a novel transformer-based motion-to-box network (TM2B), which employs a learnable relation modeling transformer (LRMT) to generate accurate motion cues without multi-stage refinement. The proposed LRMT contains two novel attention mechanisms: hierarchical interactive attention and learnable query attention. The former builds a learnable, fixed-size sampling set for each query on multi-scale feature maps, enabling each query to adaptively select prominent sampling elements and thus encode multi-scale features in a lightweight manner. The latter computes a weighted sum of the encoded features with a learnable global query, extracting valuable motion cues from all available features and thereby achieving accurate object tracking. Extensive experiments demonstrate that TM2B achieves state-of-the-art performance on KITTI, NuScenes and Waymo Open Dataset, while obtaining a significant improvement in inference speed over previous leading methods, achieving 56.8 FPS on a single NVIDIA 1080Ti GPU. The code is available at TM2B.
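The abstract describes learnable query attention only as a weighted sum of encoded features driven by a learnable global query. The sketch below illustrates that general idea in plain Python. It is not the authors' implementation: the function names, the use of the features themselves as both keys and values (no projection layers), the scaled dot-product scoring, and the random stand-in for the "learned" query are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def learnable_query_pooling(features, query):
    """Pool N feature vectors (each of length C) into one vector of length C.

    `query` stands in for a learnable global query (hypothetical: here it is
    just a list of floats, not a trained parameter). Keys and values are the
    features themselves; a real model would likely add learned projections.
    """
    scale = math.sqrt(len(query))
    # One scaled dot-product score per feature vector.
    scores = [sum(f_c * q_c for f_c, q_c in zip(f, query)) / scale
              for f in features]
    weights = softmax(scores)
    # Weighted sum of the feature vectors, channel by channel.
    pooled = [sum(w * f[c] for w, f in zip(weights, features))
              for c in range(len(query))]
    return pooled, weights


random.seed(0)
N, C = 6, 4  # assumed toy sizes: 6 encoded features with 4 channels each
features = [[random.gauss(0, 1) for _ in range(C)] for _ in range(N)]
query = [random.gauss(0, 1) for _ in range(C)]
pooled, weights = learnable_query_pooling(features, query)
```

With an all-zero query every feature receives the same weight, so the pooled vector reduces to the mean of the features. A trained query would instead emphasize the features most useful for estimating motion.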


Pages: 7078-7085

Page count: 8

