AFO-SLAM: an improved visual SLAM in dynamic scenes using acceleration of feature extraction and object detection

Cited by: 4
Authors
Wei, Jinbi [1]
Deng, Heng [1,2]
Wang, Jihong [1]
Zhang, Liguo [1,2]
Affiliations
[1] Beijing Univ Technol, Sch Informat Sci & Technol, Beijing 100124, Peoples R China
[2] Minist Educ, Engn Res Ctr Intelligence Percept & Autonomous Con, Beijing 100124, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
dynamic environments; object detection; depth information; CUDA; visual simultaneous localization and mapping (SLAM)
DOI
10.1088/1361-6501/ad6627
Chinese Library Classification
T [Industrial Technology];
Subject Classification Code
08;
Abstract
In visual simultaneous localization and mapping (SLAM) systems, traditional methods perform well under the assumption of a rigid, static environment, but degrade in dynamic scenes. Learning-based approaches have been introduced to address this, but their high computational cost hinders real-time performance, especially on embedded mobile platforms. In this article, we propose AFO-SLAM, a robust, real-time visual SLAM method for dynamic environments that accelerates both feature extraction and object detection. First, AFO-SLAM runs an independent object detection thread that uses YOLOv5 to extract semantic information and identify the bounding boxes of moving objects. To preserve the background points within these boxes, depth information is used to separate the target foreground from the background using only a single frame; points in the foreground region are treated as dynamic and rejected. To further improve performance, a CUDA program accelerates feature extraction prior to dynamic-point removal. Finally, extensive evaluations are performed on the TUM RGB-D dataset and in real scenes on a low-power embedded platform. Experimental results demonstrate that AFO-SLAM balances accuracy and real-time performance on embedded platforms and enables the generation of dense point cloud maps in dynamic scenarios.
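The single-frame, depth-based foreground/background separation described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `reject_dynamic_points`, the mean-depth threshold, and the synthetic data are assumptions; AFO-SLAM's actual segmentation criterion may differ.

```python
import numpy as np

def reject_dynamic_points(depth, bbox, keypoints):
    """Split a detection box into foreground/background by depth and
    drop keypoints that fall on the (closer) foreground object.

    depth     : HxW array of metric depth values (0 = invalid)
    bbox      : (x0, y0, x1, y1) box reported by the object detector
    keypoints : iterable of (u, v) pixel coordinates of extracted features
    """
    x0, y0, x1, y1 = bbox
    roi = depth[y0:y1, x0:x1]
    valid = roi[roi > 0]
    if valid.size == 0:                  # no usable depth in the box: keep everything
        return list(keypoints)
    thresh = valid.mean()                # simple split: foreground is closer than box mean
    kept = []
    for u, v in keypoints:
        inside = x0 <= u < x1 and y0 <= v < y1
        d = depth[v, u]
        if inside and 0 < d < thresh:    # foreground of a detected object -> dynamic, reject
            continue
        kept.append((u, v))              # background points, even inside the box, survive
    return kept
```

Thresholding on the box-mean depth works when the moving object is clearly closer than the scene behind it; a histogram or clustering split over the box's depth values would be more robust in practice.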
Pages: 16