Pyramid Point Cloud Transformer for Large-Scale Place Recognition

Cited by: 77

Authors:

Hui, Le [1]

Yang, Hang [1]

Cheng, Mingmei [1]

Xie, Jin [1]

Yang, Jian [1]

Affiliations:

[1] Nanjing Univ Sci & Technol, PCA Lab, Key Lab Intelligent Percept & Syst High Dimens In, Minist Educ, Nanjing, Jiangsu, Peoples R China

Source:

2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021) | 2021

Keywords:

SIMULTANEOUS LOCALIZATION; SLAM; HISTOGRAMS; ROBUST;

DOI:

10.1109/ICCV48922.2021.00604

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recently, deep learning based point cloud descriptors have achieved impressive results in the place recognition task. Nonetheless, due to the sparsity of point clouds, how to extract discriminative local features of point clouds to efficiently form a global descriptor is still a challenging problem. In this paper, we propose a pyramid point cloud transformer network (PPT-Net) to learn the discriminative global descriptors from point clouds for efficient retrieval. Specifically, we first develop a pyramid point transformer module that adaptively learns the spatial relationship of the different k-NN neighboring points of point clouds, where the grouped self-attention is proposed to extract discriminative local features of the point clouds. The grouped self-attention not only enhances long-term dependencies of the point clouds, but also reduces the computational cost. In order to obtain discriminative global descriptors, we construct a pyramid VLAD module to aggregate the multi-scale feature maps of point clouds into the global descriptors. By applying VLAD pooling on multi-scale feature maps, we utilize the context gating mechanism on the multiple global descriptors to adaptively weight the multi-scale global context information into the final global descriptor. Experimental results on the Oxford dataset and three in-house datasets show that our method achieves the state-of-the-art on the point cloud based place recognition task.
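The final fusion step described in the abstract, context gating over the multi-scale global descriptors, can be illustrated with a minimal sketch. It assumes the common gating form y = sigmoid(Wx + b) * x applied to the concatenation of the per-scale descriptors, followed by L2 normalization. The abstract does not state how PPT-Net combines the scales, so the concatenation, the normalization, the function names, and the toy weights below are all illustrative assumptions rather than the authors' implementation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def context_gating(descriptor, weights, bias):
    """Scale each element of `descriptor` by sigmoid(W @ descriptor + b)."""
    gated = []
    for row, b, value in zip(weights, bias, descriptor):
        activation = sum(w * d for w, d in zip(row, descriptor)) + b
        gated.append(sigmoid(activation) * value)
    return gated

def l2_normalize(vec):
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def fuse_multiscale(descriptors, weights, bias):
    """Assumed fusion: concatenate per-scale descriptors, gate, L2-normalize."""
    concatenated = [v for d in descriptors for v in d]
    return l2_normalize(context_gating(concatenated, weights, bias))

# Toy data (hypothetical): two scales, each with a 2-D global descriptor.
scale_descriptors = [[0.5, -1.0], [2.0, 0.25]]
dim = 4
# In a trained network W and b are learned; identity and zero are stand-ins.
toy_weights = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
toy_bias = [0.0] * dim

final_descriptor = fuse_multiscale(scale_descriptors, toy_weights, toy_bias)
print(final_descriptor)
```

Because the gate is a sigmoid, every element is scaled by a factor strictly between 0 and 1, so the gate can suppress a component but never flip its sign. The result is a unit-length vector of the kind typically compared by Euclidean distance in retrieval.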


Pages: 6078-6087

Page count: 10
