Multi-Modal Visual Place Recognition in Dynamics-Invariant Perception Space

Cited by: 3
Authors
Wu, Lin [1 ]
Wang, Teng [1 ]
Sun, Changyin [1 ]
Affiliation
[1] Southeast University, School of Automation, Key Laboratory of Measurement and Control of Complex Systems of Engineering, Nanjing 210096, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Semantics; Image segmentation; Feature extraction; Loss measurement; Visualization; Image recognition; Image coding; Visual place recognition; multi-modal fusion; dynamics-invariant space; image translation; deep learning;
DOI
10.1109/LSP.2021.3123907
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline Classification Codes
0808; 0809
Abstract
Visual place recognition is one of the essential and challenging problems in robotics. In this letter, we explore, for the first time, the multi-modal fusion of semantic and visual modalities in a dynamics-invariant space to improve place recognition in dynamic environments. We achieve this by first designing a novel deep learning architecture that generates the static semantic segmentation and recovers the static image directly from the corresponding dynamic image. We then leverage the spatial-pyramid-matching model to encode the static semantic segmentation into feature vectors; in parallel, the static image is encoded with the popular Bag-of-Words model. On the basis of these multi-modal features, we finally measure the similarity between the query image and a target landmark by the joint similarity of their semantic and visual codes. Extensive experiments demonstrate the effectiveness and robustness of the proposed approach for place recognition in dynamic environments.
Pages: 2197-2201
Number of pages: 5
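
The abstract above describes the matching stage of the pipeline: spatial-pyramid-matching (SPM) codes for the static semantic segmentation, Bag-of-Words (BoW) codes for the recovered static image, and a joint semantic-visual similarity between query and landmark. The following is a minimal NumPy sketch of that matching stage only; the pyramid levels, L2 normalisation, cosine similarities, the fusion weight alpha, and the toy data are illustrative assumptions rather than the authors' implementation, and the dynamics-invariant translation network that produces the static inputs is omitted.

```python
import numpy as np

def spm_encode(label_map, num_classes, levels=(1, 2, 4)):
    """Spatial-pyramid encoding of a semantic segmentation map.

    At each assumed pyramid level L the map is split into an L x L grid,
    a per-cell label histogram is computed, and all histograms are
    concatenated and L2-normalised.
    """
    h, w = label_map.shape
    feats = []
    for l in levels:
        for i in range(l):
            for j in range(l):
                cell = label_map[i * h // l:(i + 1) * h // l,
                                 j * w // l:(j + 1) * w // l]
                hist = np.bincount(cell.ravel(), minlength=num_classes).astype(float)
                feats.append(hist / max(hist.sum(), 1.0))
    vec = np.concatenate(feats)
    return vec / (np.linalg.norm(vec) + 1e-12)

def bow_encode(descriptors, vocabulary):
    """Bag-of-Words encoding of local visual descriptors.

    Each descriptor is assigned to its nearest visual word and the
    resulting word histogram is L2-normalised.
    """
    dists = np.linalg.norm(descriptors[:, None, :] - vocabulary[None, :, :], axis=2)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / (np.linalg.norm(hist) + 1e-12)

def joint_similarity(query, landmark, alpha=0.5):
    """Weighted sum of semantic and visual cosine similarities.

    alpha balances the two modalities; 0.5 is an assumed default.
    """
    sem = float(query["sem"] @ landmark["sem"])
    vis = float(query["vis"] @ landmark["vis"])
    return alpha * sem + (1.0 - alpha) * vis

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_classes, vocab = 19, rng.normal(size=(64, 128))  # toy vocabulary

    def encode(seed):
        r = np.random.default_rng(seed)
        return {"sem": spm_encode(r.integers(0, num_classes, (120, 160)), num_classes),
                "vis": bow_encode(r.normal(size=(200, 128)), vocab)}

    print(joint_similarity(encode(1), encode(2)))
```

In practice the BoW vocabulary would be trained offline (for example, by clustering local descriptors), and the weight between the semantic and visual similarities would be tuned on a validation set.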