PatchLPR: a multi-level feature fusion transformer network for LiDAR-based place recognition

Cited by: 0
Authors
Sun, Yang [1 ,2 ]
Guo, Jianhua [1 ,3 ]
Wang, Haiyang [4 ]
Zhang, Yuhang [1 ,3 ]
Zheng, Jiushuai [1 ,3 ]
Tian, Bin [5 ]
Affiliations
[1] Hebei Univ Engn, Coll Mech & Equipment Engn, Handan 056038, Peoples R China
[2] Key Lab Intelligent Ind Equipment Technol Hebei Pr, Handan, Hebei, Peoples R China
[3] Handan Key Lab Intelligent Vehicles, Handan, Hebei, Peoples R China
[4] Jizhong Energy Fengfeng Grp Co Ltd, 16 Unicom South Rd, Handan, Hebei, Peoples R China
[5] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
Keywords
SLAM; LiDAR place recognition; Deep learning; Patch; Vision; Deep
DOI
10.1007/s11760-024-03138-9
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Subject classification codes
0808; 0809
Abstract
LiDAR-based place recognition plays a crucial role in autonomous vehicles, enabling the identification of previously visited locations in GPS-denied environments. Localization can then be achieved by searching for nearest neighbors in a database of descriptors. Two common types of place recognition features are local descriptors, which compactly represent regions or points, and global descriptors, which provide an overarching view of the data. Despite the significant progress made by both types in recent years, any single representation inevitably involves information loss. To overcome this limitation, we have developed PatchLPR, a Transformer network employing multi-level feature fusion for robust place recognition. PatchLPR integrates global and local feature information, focusing on meaningful regions of the feature map to generate an environmental representation. We propose a patch feature extraction module based on the Vision Transformer to fully leverage the information and correlations of different features. We evaluated our approach on the KITTI dataset and a self-collected dataset covering over 4.2 km. The experimental results demonstrate that our method effectively utilizes multi-level features to enhance place recognition performance.
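The retrieval step described above (matching a query descriptor against a database of previously visited places by nearest-neighbor search) can be sketched as follows. This is a minimal illustration with synthetic descriptors, not the paper's implementation; the 256-dimensional descriptors and brute-force L2 search are assumptions for the example.

```python
import numpy as np

def nearest_place(query_desc, database_descs):
    """Brute-force L2 nearest-neighbor search over place descriptors.

    Returns the index of the closest database entry and its distance.
    Real systems typically use an ANN library (e.g., FAISS) at scale.
    """
    dists = np.linalg.norm(database_descs - query_desc, axis=1)
    idx = int(np.argmin(dists))
    return idx, float(dists[idx])

# Synthetic stand-ins for descriptors a network like PatchLPR would produce.
rng = np.random.default_rng(0)
database = rng.normal(size=(100, 256))               # 100 places, 256-D each
query = database[42] + 0.01 * rng.normal(size=256)   # a revisit with small noise

idx, dist = nearest_place(query, database)
print(idx, dist)
```

Because the query is a lightly perturbed copy of entry 42, the search recovers that place; in practice the perturbation corresponds to viewpoint and environmental change between visits.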
Pages: 157-165
Number of pages: 9