Multi-Modal Visual Place Recognition in Dynamics-Invariant Perception Space

Cited by: 3
Authors
Wu, Lin [1 ]
Wang, Teng [1 ]
Sun, Changyin [1 ]
Affiliation
[1] Southeast University, School of Automation, Key Laboratory of Measurement and Control of Complex Systems of Engineering, Nanjing 210096, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Semantics; Image segmentation; Feature extraction; Loss measurement; Visualization; Image recognition; Image coding; Visual place recognition; multi-modal fusion; dynamics-invariant space; image translation; deep learning;
DOI
10.1109/LSP.2021.3123907
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline Classification Codes
0808; 0809
Abstract
Visual place recognition is one of the essential and challenging problems in robotics. In this letter, we explore, for the first time, the multi-modal fusion of semantic and visual modalities in a dynamics-invariant space to improve place recognition in dynamic environments. We achieve this by first designing a novel deep learning architecture that generates the static semantic segmentation and recovers the static image directly from the corresponding dynamic image. We then leverage the spatial-pyramid-matching model to encode the static semantic segmentation into feature vectors; in parallel, the static image is encoded with the popular Bag-of-Words model. On the basis of these multi-modal features, we finally measure the similarity between the query image and a target landmark by the joint similarity of their semantic and visual codes. Extensive experiments demonstrate the effectiveness and robustness of the proposed approach for place recognition in dynamic environments.
Pages: 2197-2201
Number of pages: 5
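
The abstract above describes the matching stage of the pipeline: spatial-pyramid-matching (SPM) codes for the static semantic segmentation, Bag-of-Words (BoW) codes for the recovered static image, and a joint semantic-visual similarity between query and landmark. The following is a minimal NumPy sketch of that matching stage only; the pyramid levels, L2 normalisation, cosine similarities, the fusion weight alpha, and the toy data are illustrative assumptions rather than the authors' implementation, and the dynamics-invariant translation network that produces the static inputs is omitted.

```python
import numpy as np

def spm_encode(label_map, num_classes, levels=(1, 2, 4)):
    """Spatial-pyramid encoding of a semantic segmentation map.

    At each assumed pyramid level L the map is split into an L x L grid,
    a per-cell label histogram is computed, and all histograms are
    concatenated and L2-normalised.
    """
    h, w = label_map.shape
    feats = []
    for l in levels:
        for i in range(l):
            for j in range(l):
                cell = label_map[i * h // l:(i + 1) * h // l,
                                 j * w // l:(j + 1) * w // l]
                hist = np.bincount(cell.ravel(), minlength=num_classes).astype(float)
                feats.append(hist / max(hist.sum(), 1.0))
    vec = np.concatenate(feats)
    return vec / (np.linalg.norm(vec) + 1e-12)

def bow_encode(descriptors, vocabulary):
    """Bag-of-Words encoding of local visual descriptors.

    Each descriptor is assigned to its nearest visual word and the
    resulting word histogram is L2-normalised.
    """
    dists = np.linalg.norm(descriptors[:, None, :] - vocabulary[None, :, :], axis=2)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / (np.linalg.norm(hist) + 1e-12)

def joint_similarity(query, landmark, alpha=0.5):
    """Weighted sum of semantic and visual cosine similarities.

    alpha balances the two modalities; 0.5 is an assumed default.
    """
    sem = float(query["sem"] @ landmark["sem"])
    vis = float(query["vis"] @ landmark["vis"])
    return alpha * sem + (1.0 - alpha) * vis

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_classes, vocab = 19, rng.normal(size=(64, 128))  # toy vocabulary

    def encode(seed):
        r = np.random.default_rng(seed)
        return {"sem": spm_encode(r.integers(0, num_classes, (120, 160)), num_classes),
                "vis": bow_encode(r.normal(size=(200, 128)), vocab)}

    print(joint_similarity(encode(1), encode(2)))
```

In practice the BoW vocabulary would be trained offline (for example, by clustering local descriptors), and the weight between the semantic and visual similarities would be tuned on a validation set.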