Incorporating learnt local and global embeddings into monocular visual SLAM

Cited by: 3
Authors
Huang, Huaiyang [1 ]
Ye, Haoyang [2 ]
Sun, Yuxiang [3 ]
Wang, Lujia [1 ]
Liu, Ming [1 ]
Affiliations
[1] Hong Kong Univ Sci & Technol, Dept Elect & Comp Engn, Hong Kong, Peoples R China
[2] Huawei Technol, IAS BU Smart Driving Product Dept, Shanghai, Peoples R China
[3] Hong Kong Polytech Univ, Dept Mech Engn, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Visual simultaneous localization and mapping (SLAM); Visual-based navigation; Mapping; ODOMETRY; RECOGNITION; TRACKING;
DOI
10.1007/s10514-021-10007-8
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Traditional approaches to Visual Simultaneous Localization and Mapping (VSLAM) rely on low-level visual information for state estimation, such as handcrafted local features or the image gradient. While significant progress has been made along this track, the performance of state-of-the-art systems generally degrades under more challenging conditions for monocular VSLAM, e.g., varying illumination. As a consequence, the robustness and accuracy of monocular VSLAM remain open concerns. This paper presents a monocular VSLAM system that fully exploits learnt features for better state estimation. The proposed system leverages both learnt local features and global embeddings in different modules of the system: direct camera pose estimation, inter-frame feature association, and loop closure detection. Building on a probabilistic interpretation of keypoint prediction, we formulate camera pose tracking in a direct manner and parameterize local features with their uncertainty taken into account. To alleviate the quantization effect, we adapt the mapping module to generate better 3D landmarks, which helps guarantee the system's robustness. Detecting temporal loop closures via deep global embeddings further improves the robustness and accuracy of the proposed system. The proposed system is extensively evaluated on public datasets (Tsukuba, EuRoC, and KITTI) and compared against state-of-the-art methods. The competitive performance of camera pose estimation confirms the effectiveness of our method.
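The loop-closure step described in the abstract can be illustrated with a minimal sketch: a deep global embedding (e.g., a NetVLAD-style descriptor) of the current keyframe is compared against those of earlier keyframes, and a candidate is flagged when the cosine similarity is high and the two frames are sufficiently separated in time. The class name, thresholds, and descriptor handling below are illustrative assumptions, not the authors' implementation.

import numpy as np

# Minimal, illustrative sketch (Python): temporal loop-closure detection by
# comparing deep global embeddings of keyframes. Parameter values are assumed.
class LoopClosureDetector:
    def __init__(self, similarity_threshold=0.85, min_frame_gap=50):
        self.similarity_threshold = similarity_threshold  # cosine-similarity cutoff (assumed)
        self.min_frame_gap = min_frame_gap                # ignore temporally adjacent frames
        self.embeddings = []                              # L2-normalised descriptors, one per keyframe
        self.frame_ids = []

    def add_keyframe(self, frame_id, embedding):
        v = np.asarray(embedding, dtype=np.float64)
        self.embeddings.append(v / np.linalg.norm(v))
        self.frame_ids.append(frame_id)

    def detect(self, frame_id, embedding):
        """Return (matched_frame_id, similarity) for a loop candidate, or None."""
        if not self.embeddings:
            return None
        q = np.asarray(embedding, dtype=np.float64)
        q = q / np.linalg.norm(q)
        sims = np.stack(self.embeddings) @ q              # cosine similarities to all stored keyframes
        for idx in np.argsort(-sims):                     # best match first
            if frame_id - self.frame_ids[idx] < self.min_frame_gap:
                continue                                  # too recent; not a loop
            if sims[idx] >= self.similarity_threshold:
                return (self.frame_ids[idx], float(sims[idx]))
            return None                                   # best temporally valid match is too weak
        return None

In a full system such a candidate would still be verified geometrically (e.g., via a relative-pose check on matched local features) before a loop constraint is added.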
Pages: 789-803
Number of pages: 15