Pyramid Point Cloud Transformer for Large-Scale Place Recognition

Cited by: 78
Authors
Hui, Le [1]
Yang, Hang [1]
Cheng, Mingmei [1]
Xie, Jin [1]
Yang, Jian [1]
Affiliations
[1] Nanjing Univ Sci & Technol, PCA Lab, Key Lab Intelligent Percept & Syst High Dimens In, Minist Educ, Nanjing, Jiangsu, Peoples R China
Source
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021) | 2021
Keywords
SIMULTANEOUS LOCALIZATION; SLAM; HISTOGRAMS; ROBUST;
DOI
10.1109/ICCV48922.2021.00604
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, deep learning based point cloud descriptors have achieved impressive results in the place recognition task. Nonetheless, due to the sparsity of point clouds, how to extract discriminative local features of point clouds to efficiently form a global descriptor is still a challenging problem. In this paper, we propose a pyramid point cloud transformer network (PPT-Net) to learn the discriminative global descriptors from point clouds for efficient retrieval. Specifically, we first develop a pyramid point transformer module that adaptively learns the spatial relationship of the different k-NN neighboring points of point clouds, where the grouped self-attention is proposed to extract discriminative local features of the point clouds. The grouped self-attention not only enhances long-term dependencies of the point clouds, but also reduces the computational cost. In order to obtain discriminative global descriptors, we construct a pyramid VLAD module to aggregate the multi-scale feature maps of point clouds into the global descriptors. By applying VLAD pooling on multi-scale feature maps, we utilize the context gating mechanism on the multiple global descriptors to adaptively weight the multi-scale global context information into the final global descriptor. Experimental results on the Oxford dataset and three in-house datasets show that our method achieves the state-of-the-art on the point cloud based place recognition task.
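The grouped self-attention mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the groups are formed, so this example assumes the feature channels are split into equal groups and that scaled dot-product attention is run over the points within each group. It also uses the raw features as queries, keys, and values in place of learned projections. The function name `grouped_self_attention` and the parameter `num_groups` are hypothetical and are not taken from the paper.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def grouped_self_attention(features, num_groups):
    """Illustrative grouped self-attention over point features.

    features: array of shape (N, C), one C-dimensional feature per point.
    Assumption: channels are split into `num_groups` equal groups and
    attention over the N points is computed independently per group.
    Learned query/key/value projections are omitted for brevity.
    """
    _, channels = features.shape
    if channels % num_groups != 0:
        raise ValueError("channel count must be divisible by num_groups")

    outputs = []
    for group in np.split(features, num_groups, axis=1):  # each (N, C / g)
        dim = group.shape[1]
        scores = group @ group.T / np.sqrt(dim)  # (N, N) point-to-point scores
        weights = softmax(scores, axis=-1)       # each row sums to 1
        outputs.append(weights @ group)          # weighted mix of point features
    # Concatenate the per-group results back to shape (N, C).
    return np.concatenate(outputs, axis=1)


rng = np.random.default_rng(0)
points = rng.normal(size=(8, 16))  # 8 points, 16 channels (toy data)
out = grouped_self_attention(points, num_groups=4)
print(out.shape)
```

In this sketch, if learned projections were added per group, they would total C²/g weights instead of C² for g groups, which is one way a grouped design can lower cost. Whether PPT-Net realizes its savings this way is not stated in the abstract.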
Pages: 6078-6087
Number of pages: 10