USD-SLAM: A Universal Visual SLAM Based on Large Segmentation Model in Dynamic Environments

Cited by: 2
Authors
Wang, Jingwei [1 ]
Ren, Yizhang [2 ]
Li, Zhiwei [1 ]
Xie, Xiaoming [1 ]
Chen, Zilong [3 ]
Shen, Tianyu [1 ]
Liu, Huaping [3 ]
Wang, Kunfeng [1 ]
Affiliations
[1] Beijing Univ Chem Technol, Coll Informat Sci & Technol, Beijing 100029, Peoples R China
[2] Beihang Univ, Coll Instrumental Sci & Optoelect Engn, Beijing 100191, Peoples R China
[3] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
Source
IEEE ROBOTICS AND AUTOMATION LETTERS | 2024, Vol. 9, No. 12
Funding
National Natural Science Foundation of China;
Keywords
Image segmentation; Simultaneous localization and mapping; Dynamics; Three-dimensional displays; Motion segmentation; Semantics; Feature extraction; Vehicle dynamics; Location awareness; Cameras; SLAM; localization; object detection; segmentation and categorization; TRACKING;
DOI
10.1109/LRA.2024.3498781
CLC Classification Number
TP24 [Robotics];
Discipline Classification Code
080202; 1405;
Abstract
Visual Simultaneous Localization and Mapping (SLAM) has been widely adopted in autonomous driving and robotics. While most SLAM systems operate effectively in static or low-dynamic environments, achieving precise pose estimation in diverse unknown dynamic environments remains a significant challenge. This letter introduces USD-SLAM, a universal visual SLAM system that combines a universal large segmentation model with a 3D spatial motion state constraint module to handle arbitrary dynamic objects in the environment. The system first employs a large segmentation model, guided by precise prompts, to accurately identify movable regions. 3D spatial motion state constraints are then applied to these regions to determine which objects are actually moving. Finally, the moving regions are excluded from subsequent tracking, localization, and mapping, ensuring stable and high-precision pose estimation. Experimental results demonstrate that the method operates robustly in a variety of dynamic and static environments without additional training, achieving higher localization accuracy than other state-of-the-art dynamic SLAM systems.
Pages: 11810-11817
Number of pages: 8
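
The pipeline described in the abstract lends itself to a short illustration. Below is a minimal Python sketch of the 3D spatial motion state check, assuming an RGB-D camera with known intrinsics and a promptable segmentation model (e.g., SAM) supplying movable-region masks: 3D points of a truly static region, transformed by the estimated inter-frame camera motion, should land on consistent depths in the current frame, while large residuals flag the object as moving. All function names and thresholds here are hypothetical illustrations, not the authors' implementation.

import numpy as np

def backproject(depth, K):
    """Back-project a depth map (h, w) into 3D points in camera coordinates."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    return np.stack([x, y, z], axis=-1)                  # shape (h, w, 3)

def region_is_moving(depth_prev, depth_curr, T_rel, K, region_mask, resid_thresh=0.05):
    """3D motion-state check for one segmented movable region.

    If the region were static, its 3D points from the previous frame,
    transformed by the estimated camera motion T_rel (4x4), would reproject
    onto consistent depths in the current frame; a large median depth
    residual therefore indicates the object itself is moving.
    """
    pts = backproject(depth_prev, K)[region_mask]        # (n, 3) points in the region
    pts = pts[pts[:, 2] > 0]                             # keep valid depths
    if len(pts) == 0:
        return False
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])     # homogeneous coordinates
    pred = (T_rel @ pts_h.T).T[:, :3]                    # predicted 3D points in current frame
    pred = pred[pred[:, 2] > 1e-6]                       # keep points in front of the camera
    if len(pred) == 0:
        return False
    z = pred[:, 2]
    u = np.round(pred[:, 0] * K[0, 0] / z + K[0, 2]).astype(int)
    v = np.round(pred[:, 1] * K[1, 1] / z + K[1, 2]).astype(int)
    h, w = depth_curr.shape
    ok = (u >= 0) & (u < w) & (v >= 0) & (v < h)         # points projecting into the image
    if not ok.any():
        return False
    resid = np.abs(depth_curr[v[ok], u[ok]] - z[ok])     # observed vs. predicted depth
    return np.median(resid) > resid_thresh

def mask_dynamic_keypoints(keypoints, moving_masks):
    """Drop keypoints (n, 2), given as (u, v) pixels, inside any moving-object mask."""
    keep = np.ones(len(keypoints), dtype=bool)
    u = keypoints[:, 0].astype(int)
    v = keypoints[:, 1].astype(int)
    for m in moving_masks:                               # each mask: boolean (h, w)
        keep &= ~m[v, u]
    return keypoints[keep]

if __name__ == "__main__":
    # Synthetic check: a static camera watching an object that jumps closer.
    K = np.array([[525.0, 0.0, 319.5], [0.0, 525.0, 239.5], [0.0, 0.0, 1.0]])
    depth_prev = np.full((480, 640), 2.0)
    depth_curr = depth_prev.copy()
    region = np.zeros((480, 640), dtype=bool)
    region[200:280, 300:380] = True                      # mask from the segmentation model
    depth_curr[200:280, 300:380] = 1.5                   # the object moved 0.5 m closer
    T_rel = np.eye(4)                                    # no camera motion between frames
    print(region_is_moving(depth_prev, depth_curr, T_rel, K, region))  # True

In a full system, keypoints falling inside regions flagged as moving would be removed before tracking, localization, and mapping, as mask_dynamic_keypoints sketches.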