PatchLPR: a multi-level feature fusion transformer network for LiDAR-based place recognition

Cited by: 0
Authors
Sun, Yang [1 ,2 ]
Guo, Jianhua [1 ,3 ]
Wang, Haiyang [4 ]
Zhang, Yuhang [1 ,3 ]
Zheng, Jiushuai [1 ,3 ]
Tian, Bin [5 ]
Affiliations
[1] Hebei Univ Engn, Coll Mech & Equipment Engn, Handan 056038, Peoples R China
[2] Key Lab Intelligent Ind Equipment Technol Hebei Pr, Handan, Hebei, Peoples R China
[3] Handan Key Lab Intelligent Vehicles, Handan, Hebei, Peoples R China
[4] Jizhong Energy Fengfeng Grp Co Ltd, 16 Unicom South Rd, Handan, Hebei, Peoples R China
[5] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
Keywords
SLAM; LiDAR place recognition; Deep learning; Patch; Vision; Deep
DOI
10.1007/s11760-024-03138-9
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Subject classification codes
0808; 0809
Abstract
LiDAR-based place recognition plays a crucial role in autonomous vehicles, enabling the identification of previously visited locations in GPS-denied environments. Localization can then be achieved by searching for nearest neighbors in a database of descriptors. Two common types of place recognition features are local descriptors, which compactly represent regions or points, and global descriptors, which provide an overarching view of the data. Despite the significant progress made by both types in recent years, any single representation inevitably involves information loss. To overcome this limitation, we have developed PatchLPR, a Transformer network employing multi-level feature fusion for robust place recognition. PatchLPR integrates global and local feature information, focusing on meaningful regions of the feature map to generate an environmental representation. We propose a patch feature extraction module based on the Vision Transformer to fully leverage the information and correlations of different features. We evaluated our approach on the KITTI dataset and a self-collected dataset covering over 4.2 km. The experimental results demonstrate that our method effectively utilizes multi-level features to enhance place recognition performance.
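The retrieval step described above (matching a query descriptor against a database of previously visited places by nearest-neighbor search) can be sketched as follows. This is a minimal illustration with synthetic descriptors, not the paper's implementation; the 256-dimensional descriptors and brute-force L2 search are assumptions for the example.

```python
import numpy as np

def nearest_place(query_desc, database_descs):
    """Brute-force L2 nearest-neighbor search over place descriptors.

    Returns the index of the closest database entry and its distance.
    Real systems typically use an ANN library (e.g., FAISS) at scale.
    """
    dists = np.linalg.norm(database_descs - query_desc, axis=1)
    idx = int(np.argmin(dists))
    return idx, float(dists[idx])

# Synthetic stand-ins for descriptors a network like PatchLPR would produce.
rng = np.random.default_rng(0)
database = rng.normal(size=(100, 256))               # 100 places, 256-D each
query = database[42] + 0.01 * rng.normal(size=256)   # a revisit with small noise

idx, dist = nearest_place(query, database)
print(idx, dist)
```

Because the query is a lightly perturbed copy of entry 42, the search recovers that place; in practice the perturbation corresponds to viewpoint and environmental change between visits.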
Pages: 157-165
Number of pages: 9