Attention-SLAM: A Visual Monocular SLAM Learning From Human Gaze

Cited by: 34
Authors
Li, Jinquan [1 ]
Pei, Ling [1 ]
Zou, Danping [1 ]
Xia, Songpengcheng [1 ]
Wu, Qi [1 ]
Li, Tao [1 ]
Sun, Zhen [1 ]
Yu, Wenxian [1 ]
Affiliation
[1] Shanghai Jiao Tong Univ, Shanghai Key Lab Nav & Locat Based Serv, Shanghai 200240, Peoples R China
Keywords
Simultaneous localization and mapping; Visualization; Semantics; Feature extraction; Data mining; Predictive models; Adaptation models; Visual saliency; monocular visual semantic SLAM; weighted bundle adjustment; SIMULTANEOUS LOCALIZATION; ODOMETRY
DOI
10.1109/JSEN.2020.3038432
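Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract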
This paper proposes a novel simultaneous localization and mapping (SLAM) approach, Attention-SLAM, which simulates the human navigation mode by combining a visual saliency model (SalNavNet) with traditional monocular visual SLAM. First, a visual saliency model named SalNavNet is proposed. In SalNavNet, we introduce a correlation module and propose an adaptive Exponential Moving Average (EMA) module. These modules mitigate the center bias that most current saliency models exhibit, which enables the saliency maps generated by SalNavNet to pay more attention to the same salient object. An open-source saliency SLAM dataset, Salient-Euroc, is published; it consists of the EuRoC dataset and the corresponding saliency maps. Moreover, we propose a new optimization method called Weighted Bundle Adjustment (Weighted BA) in Attention-SLAM. Most SLAM methods treat all the features extracted from the images as equally important during the optimization process. In Weighted BA, the feature points extracted from the salient regions have greater importance. Comprehensive test results show that Attention-SLAM outperforms benchmarks such as Direct Sparse Odometry (DSO), ORB-SLAM, and Salient DSO in 7 of 11 test cases. The test cases are all indoor scenes, with varying brightness, speed, and image distortion. Compared with ORB-SLAM, our method improves accuracy by 4% and efficiency by 6.5% on average.
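The idea behind Weighted BA, that residuals from salient regions count for more in the optimization, can be illustrated with a minimal sketch. This is not the paper's formulation: the abstract does not give the weighting rule, so the weight `1 + s` (with `s` a saliency value in [0, 1]), the pinhole model, and all function names below are assumptions made purely for illustration.

```python
def project(point, focal, cx, cy):
    """Pinhole projection of a 3-D point expressed in the camera frame."""
    x, y, z = point
    return (focal * x / z + cx, focal * y / z + cy)


def weighted_reprojection_cost(points, observations, saliency,
                               focal=1.0, cx=0.0, cy=0.0):
    """Sum of squared reprojection errors, each scaled by a saliency weight.

    ASSUMPTION: weight = 1 + s, where s in [0, 1] is the saliency value at
    the observed pixel. The actual weighting used by Attention-SLAM is not
    stated in the abstract.
    """
    total = 0.0
    for point, (u, v), s in zip(points, observations, saliency):
        pu, pv = project(point, focal, cx, cy)
        squared_error = (pu - u) ** 2 + (pv - v) ** 2
        total += (1.0 + s) * squared_error
    return total


# Two landmarks; the second is observed with a small pixel error.
points = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0)]
observations = [(0.0, 0.0), (0.6, 0.0)]

uniform_cost = weighted_reprojection_cost(points, observations, [0.0, 0.0])
salient_cost = weighted_reprojection_cost(points, observations, [0.0, 1.0])
```

With zero saliency everywhere the cost reduces to the ordinary unweighted bundle adjustment objective; marking the second observation as fully salient doubles the contribution of its error, so an optimizer minimizing this cost would prioritize fitting that point.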
Pages: 6408-6420
Page count: 13
Related papers
54 in total
  • [1] [Anonymous], 2016, ARXIV161109571
  • [2] Long short-term memory
    Hochreiter, S
    Schmidhuber, J
    [J]. NEURAL COMPUTATION, 1997, 9 (08) : 1735 - 1780
  • [3] Visual 3-D SLAM from UAVs
    Artieda, Jorge
    Sebastian, Jose M.
    Campoy, Pascual
    Correa, Juan F.
    Mondragon, Ivan F.
    Martinez, Carol
    Olivares, Miguel
    [J]. JOURNAL OF INTELLIGENT & ROBOTIC SYSTEMS, 2009, 55 (4-5) : 299 - 321
  • [4] Spatio-Temporal Saliency Networks for Dynamic Saliency Prediction
    Bak, Cagdas
    Kocak, Aysun
    Erdem, Erkut
    Erdem, Aykut
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2018, 20 (07) : 1688 - 1698
  • [5] SURF: Speeded up robust features
    Bay, Herbert
    Tuytelaars, Tinne
    Van Gool, Luc
    [J]. COMPUTER VISION - ECCV 2006 , PT 1, PROCEEDINGS, 2006, 3951 : 404 - 417
  • [6] DynaSLAM: Tracking, Mapping, and Inpainting in Dynamic Scenes
    Bescos, Berta
    Facil, Jose M.
    Civera, Javier
    Neira, Jose
    [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2018, 3 (04) : 4076 - 4083
  • [7] Saliency Prediction in the Deep Learning Era: Successes and Limitations
    Borji, Ali
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (02) : 679 - 700
  • [8] Brasch N, 2018, IEEE INT C INT ROBOT, P393, DOI 10.1109/IROS.2018.8593828
  • [9] Exploiting Symmetries to Design EKFs With Consistency Properties for Navigation and SLAM
    Brossard, Martin
    Barran, Axel
    Bonnabel, Silvere
    [J]. IEEE SENSORS JOURNAL, 2019, 19 (04) : 1572 - 1579
  • [10] The EuRoC micro aerial vehicle datasets
    Burri, Michael
    Nikolic, Janosch
    Gohl, Pascal
    Schneider, Thomas
    Rehder, Joern
    Omari, Sammy
    Achtelik, Markus W.
    Siegwart, Roland
    [J]. INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2016, 35 (10) : 1157 - 1163