SLM-SLAM: a visual SLAM system based on segmented large-scale model in dynamic scenes and zero-shot conditions

Cited by: 1
Authors
Zhu, Fan [1 ,2 ]
Chen, Ziyu [1 ,2 ]
Jiang, Chunmao [1 ,2 ]
Xu, Liwei [2 ]
Zhang, Shijin [2 ]
Yu, Biao [2 ]
Zhu, Hui [2 ]
Affiliations
[1] Univ Sci & Technol China, Hefei 230026, Peoples R China
[2] Hefei Inst Phys Sci, Chinese Acad Sci, Hefei 230031, Peoples R China
Keywords
visual SLAM; semantic segmentation; dynamic scenes; large-scale model; motion consistency detection;
DOI
10.1088/1361-6501/ad4ab6
Chinese Library Classification: T [Industrial Technology]
Discipline Code: 08
Abstract
In practical applications, diverse dynamic objects can compromise the localization accuracy of most conventional visual simultaneous localization and mapping (VSLAM) systems. At the same time, many neural-network-based dynamic VSLAM systems require pre-training for specific application scenarios. We introduce SLM-SLAM, the first VSLAM system to process dynamic scenes in a zero-shot manner: it handles various dynamic objects without pre-training, enabling straightforward adaptation to different application scenarios. First, we design an open-world semantic segmentation module based on a segmented large-scale model to acquire semantic information about the scene. Next, we devise a label-based strategy for selecting feature points, jointly optimizing poses with weighted labels derived from both semantic and geometric information. Finally, we refine the keyframe selection strategy of ORB-SLAM3 to prevent matching errors caused by an insufficient number of remaining static feature points in the scene. We conduct experiments on the TUM dataset, the KITTI dataset, and real-world scenarios. The results show that in dynamic scenes SLM-SLAM significantly improves localization accuracy over ORB-SLAM3, with performance comparable to state-of-the-art dynamic VSLAM systems.
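The abstract's label-based feature-point selection, which combines a semantic prior with a geometric motion-consistency check before pose optimization, can be illustrated with a minimal sketch. All names, weight values, and the threshold below are illustrative assumptions, not the authors' implementation:

```python
# Hypothetical sketch of weighting feature points by a semantic label
# (from an open-world segmentation module) and a geometric
# motion-consistency residual, as described in the abstract.
# The category weights and threshold are invented for illustration.

def point_weight(semantic_label, reprojection_error, error_threshold=2.0):
    """Combine a semantic prior with a geometric consistency check.

    semantic_label: category name from the segmentation module.
    reprojection_error: pixel residual of the point under the current
        pose estimate; large residuals suggest the point is moving.
    """
    # Semantic prior: categories likely to move get a low weight.
    dynamic_prior = {"person": 0.0, "car": 0.2, "bicycle": 0.2}
    w_semantic = dynamic_prior.get(semantic_label, 1.0)

    # Geometric check: down-weight points that are inconsistent with
    # the estimated camera motion, regardless of their label.
    w_geometric = 1.0 if reprojection_error < error_threshold else 0.0

    return w_semantic * w_geometric


def filter_static_points(points):
    """Keep only points weighted highly enough to enter pose optimization."""
    return [p for p in points if point_weight(p["label"], p["err"]) > 0.5]
```

In this sketch a point survives only if both cues agree it is static, which mirrors the abstract's idea of jointly using semantic and geometric information rather than trusting either alone.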
Pages: 12
References (39 in total)
[1] Bescos B, Facil J M, Civera J, Neira J. DynaSLAM: Tracking, Mapping, and Inpainting in Dynamic Scenes. IEEE Robotics and Automation Letters, 2018, 3(4): 4076-4083.
[2] Campos C, Elvira R, Gomez Rodriguez J J, Montiel J M M, Tardos J D. ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial, and Multimap SLAM. IEEE Transactions on Robotics, 2021, 37(6): 1874-1890.
[3] Cheng Y M. arXiv:2305.06558, 2023.
[4] Engel J, Koltun V, Cremers D. Direct Sparse Odometry. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(3): 611-625.
[5] Engel J, Schoeps T, Cremers D. LSD-SLAM: Large-Scale Direct Monocular SLAM. Computer Vision - ECCV 2014, Pt II, 2014, 8690: 834-849.
[6] Fan Y, Zhang Q, Tang Y, Liu S, Han H. Blitz-SLAM: A semantic SLAM in dynamic environments. Pattern Recognition, 2022, 121.
[7] Favorskaya M N. Deep Learning for Visual SLAM: The State-of-the-Art and Future Trends. Electronics, 2023, 12(9).
[8] Forster C. IEEE International Conference on Robotics and Automation, 2014: 15. DOI: 10.1109/ICRA.2014.6906584.
[9] He J, Li M, Wang Y, Wang H. OVD-SLAM: An Online Visual SLAM for Dynamic Environments. IEEE Sensors Journal, 2023, 23(12): 13210-13219.
[10] Hu B, Luo J. A Robust Semi-Direct 3D SLAM for Mobile Robot Based on Dense Optical Flow in Dynamic Scenes. Biomimetics, 2023, 8(4).