STS-SLAM: Joint Visual SLAM and Multi-Object Tracking Based on Spatio-Temporal Similarity

Cited by: 0

Authors:

Peng S. [1]

Ran T. [1]

Zhang J. [1]

Xiao W. [1]

Yuan L. [2]

Affiliations:

[1] Xinjiang University, School of Mechanical Engineering, Urumqi

[2] Shanghai Jiao Tong University, ICCI, Shanghai

Source:

IEEE Transactions on Intelligent Vehicles | 2025, Vol. 10, No. 1

Funding:

National Natural Science Foundation of China

Keywords:

Dynamic visual SLAM; graph optimization; object tracking; spatio-temporal similarity; vehicle localization

DOI:

10.1109/TIV.2024.3415006

Abstract:

Visual Simultaneous Localization and Mapping (SLAM) is a critical technique for intelligent vehicles and autonomous driving. Most SLAM systems assume the environment to be static or treat dynamic features as outliers for better localization performance. However, clear information about dynamic objects is crucial for localization and decision-making in complex environments. This paper proposes STS-SLAM, a tightly-coupled visual simultaneous localization and multi-object tracking system in dynamic scenes. It can synchronously optimize the motion of the ego-vehicle and objects and estimate object velocity without any prior information about the object. To accurately cluster features, we design a feature metric based on spatio-temporal similarity (STS), which considers the intrinsic properties and current state of the features. All static features are employed for ego-vehicle localization, and features with high STS on dynamic objects are robustly tracked. Furthermore, we propose an adaptive scaling covariance kernel (ASCK) algorithm based on STS to deal with perceptual noise and outliers, which avoids manual optimization of kernel parameters. The STS-SLAM problem is modeled as a dynamic constraint factor graph for joint optimization of dynamic and static structures. Finally, evaluation results on the KITTI tracking dataset, Oxford multi-motion dataset, and real-world scenarios show that the proposed algorithm obtains higher localization accuracy and smaller tracking errors than other state-of-the-art SLAM algorithms. © 2016 IEEE.

Pages: 494-508

Page count: 14
