YES-SLAM: YOLOv7-enhanced-semantic visual SLAM for mobile robots in dynamic scenes

Cited by: 6
Authors
Liu, Hang [1]
Luo, Jingwen [1, 2]
Affiliations
[1] Yunnan Normal Univ, Sch Informat Sci & Technol, 768 Juxian St, Kunming 650500, Yunnan, Peoples R China
[2] Engn Res Ctr Comp Vis & Intelligent Control Techno, Dept Educ Yunnan Prov, Kunming, Yunnan, Peoples R China
Keywords
dynamic scenes; simultaneous localization and mapping (SLAM); YOLOv7; depth camera; loop closure detection; 3D semantic map
DOI
10.1088/1361-6501/ad14e7
Chinese Library Classification (CLC) number
T [Industrial Technology]
Subject classification code
08
Abstract
In dynamic scenes, moving objects cause significant error accumulation in the robot's pose estimation and may even lead to tracking loss. To address these problems, this paper proposes a semantic visual simultaneous localization and mapping (SLAM) algorithm based on YOLOv7. First, the lightweight network YOLOv7 is employed to acquire the semantic information of the objects in the scene, and flood filling and edge-enhancement techniques are combined to separate the dynamic feature points from the extracted feature point set accurately and quickly. The resulting high-confidence static feature points are then used to estimate the robot's pose accurately. Next, a keyframe selection strategy is constructed from the semantic information provided by YOLOv7, the motion magnitude of the robot, and the number of dynamic feature points in the camera's field of view. On this basis, a robust loop closure detection method is developed by introducing the semantic information into the bag-of-words model, and global bundle adjustment is performed on all keyframes and map points to obtain a globally consistent pose graph. Finally, YOLOv7 is further used to perform semantic segmentation on the keyframes and remove the dynamic objects in their semantic masks, and point cloud pre-processing is combined with an octree map to build a 3D semantic map for navigation. A series of simulations on the TUM dataset and a case study in a real scene demonstrate the superior performance of the proposed algorithm.
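The first step of the abstract, separating dynamic feature points from static ones using detector output, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the detector returns axis-aligned bounding boxes with class labels, it treats every feature point inside a box of a potentially dynamic class as dynamic, and it omits the flood filling and edge enhancement that the paper uses to refine the boundary. The names `split_static_dynamic` and `DYNAMIC_CLASSES` are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple

Point = Tuple[float, float]                    # (x, y) pixel coordinates of a feature point
Box = Tuple[float, float, float, float, str]   # (x_min, y_min, x_max, y_max, class label)

# Hypothetical set of classes treated as potentially dynamic (an assumption,
# not a list taken from the paper).
DYNAMIC_CLASSES = frozenset({"person"})


def split_static_dynamic(
    points: Sequence[Point],
    boxes: Sequence[Box],
    dynamic_classes: Iterable[str] = DYNAMIC_CLASSES,
) -> Tuple[List[Point], List[Point]]:
    """Split feature points into (static, dynamic) lists.

    A point is labeled dynamic if it lies inside any bounding box whose
    class label is in `dynamic_classes`. This is a coarse box-level test;
    it does not refine the object outline.
    """
    wanted = set(dynamic_classes)
    dynamic_boxes = [b for b in boxes if b[4] in wanted]

    static: List[Point] = []
    dynamic: List[Point] = []
    for x, y in points:
        inside = any(
            x0 <= x <= x1 and y0 <= y <= y1
            for x0, y0, x1, y1, _label in dynamic_boxes
        )
        (dynamic if inside else static).append((x, y))
    return static, dynamic


# Usage with made-up detections: one "person" box and one "chair" box.
points = [(10.0, 10.0), (50.0, 50.0), (200.0, 120.0)]
boxes = [(40.0, 40.0, 100.0, 100.0, "person"), (180.0, 100.0, 220.0, 140.0, "chair")]
static_pts, dynamic_pts = split_static_dynamic(points, boxes)
```

In this sketch only the static points would be passed on to pose estimation. A box-level test also discards static background points that happen to fall inside the box, which is one reason a finer object outline is preferable.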
Pages: 19