Learning point cloud context information based on 3D transformer for more accurate and efficient classification

Cited by: 4

Authors:

Chen, Yiping ^{[1]}

Zhang, Shuai ^{[1,3]}

Lin, Weisheng ^{[2]}

Zhang, Shuhang ^{[1,3]}

Zhang, Wuming ^{[1]}

Affiliations:

[1] Sun Yat Sen Univ, Sch Geospatial Engn & Sci, Zhuhai, Peoples R China

[2] Xiamen Univ, Fujian Key Lab Sensing & Comp Smart Cities, Xiamen, Peoples R China

[3] Sun Yat Sen Univ, Sch Geospatial Engn & Sci, Zhuhai 519082, Peoples R China

Source:

PHOTOGRAMMETRIC RECORD | 2023 / Vol. 38 / Issue 184

Funding:

National Natural Science Foundation of China;

Keywords:

classification; context information; point cloud; 3D transformer;

DOI:

10.1111/phor.12469

Chinese Library Classification number:

P9 [Physical Geography];

Discipline classification code:

0705 ; 070501 ;

Abstract:

Point cloud semantic understanding has made remarkable progress alongside the development of 3D deep learning. However, aggregating spatial information to improve a network's local feature learning capability remains a major challenge. Many methods have been proposed to improve local information learning, such as constructing a multi-area structure to capture information from different areas. However, such methods lose some local information because each point's feature is learned independently. To solve this problem, a new network is proposed that considers the importance of the differences between points in the neighbourhood. Local feature capture is enhanced by highlighting the differing feature importance of the points within a neighbourhood. First, a T-Net is constructed to learn a point cloud transformation matrix that handles point cloud disorder. Second, a transformer is used to mitigate the loss of local information caused by treating each point in the neighbourhood independently. The experimental results show an overall accuracy of 92.2% on the ModelNet40 dataset and 93.8% on the ModelNet10 dataset. The figure shows the point cloud classification pipeline, which is similar to that of PointNet. T-Net is used to eliminate the effect of point cloud rotation, and a 3D transformer module is utilised to learn the point cloud context information. Finally, an MLP is utilised to map the features to the category dimension. Experiments show that the method is accurate and efficient.
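The idea of weighting points in a neighbourhood by their importance to one another can be illustrated with plain scaled dot-product self-attention. The sketch below is not the authors' implementation: the function names `softmax` and `neighbourhood_attention` are hypothetical, and it assumes identity query, key and value projections, whereas a trained transformer module would learn those maps.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def neighbourhood_attention(features):
    """Scaled dot-product self-attention over one neighbourhood.

    features: list of k point feature vectors (each a list of d floats).
    Assumption: queries, keys and values are the raw features
    (identity projections); a real network would learn these maps.
    Returns (updated_features, attention_weights).
    """
    d = len(features[0])
    scale = math.sqrt(d)
    weights = []
    for q in features:
        # Similarity of this point to every point in the neighbourhood.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in features]
        weights.append(softmax(scores))
    # Each updated feature is a weighted mix of all neighbour features,
    # so no point is processed independently of the others.
    updated = [
        [sum(w * v[j] for w, v in zip(row, features)) for j in range(d)]
        for row in weights
    ]
    return updated, weights


# Toy neighbourhood: two similar points and one dissimilar point.
neighbourhood = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
updated, weights = neighbourhood_attention(neighbourhood)
```

In this toy neighbourhood the first point attends more strongly to its near-duplicate than to the dissimilar third point, which is the kind of per-neighbour weighting the abstract describes.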


Pages: 603-616

Number of pages: 14
