PSVMLP: Point and Shifted Voxel MLP for 3D deep learning

Cited by: 2

Authors:

Xie, Guanghu [1]

Liu, Yang [1]

Ji, Yiming [1]

Xie, Zongwu [1]

Cao, Baoshi [1]

Affiliations:

[1] Harbin Inst Technol, State Key Lab Robot & Syst, Harbin 150001, Heilongjiang, Peoples R China

Source:

PATTERN RECOGNITION LETTERS | 2024 / Vol. 185

Keywords:

Deep learning; Shape part segmentation; Shape classification; Point clouds; CLOUD; NETWORK;

DOI:

10.1016/j.patrec.2024.05.016

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We propose a high-performance 3D feature extraction deep learning network based on point clouds and shifted voxels, named Point and Shifted Voxel MLP (PSVMLP). The main component of PSVMLP is a simple Multi-Layer Perceptron (MLP) structure. PSVMLP extracts multi-scale features from 3D data by combining point cloud and voxel-based feature extraction methods. In voxel representation learning, we propose a wide-range geometric feature extraction method based on axial shifting operations and a simple MLP structure. The axial shifting operations shift voxels in the depth, height, and width directions, capturing more geometric information. In point cloud representation learning, we use a simple MLP structure to extract local features, and we extract global features by additionally incorporating a transformer structure. Combining the point cloud and voxel feature extraction methods yields rich feature representations at different scales, enhancing the model's expressive power and generalization performance. Although it is built primarily on a simple MLP framework, our model demonstrates remarkable performance on basic geometric feature learning tasks, namely shape classification and shape part segmentation. Our code is available at https://github.com/hitxraz/psvmlp.
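
The abstract does not spell out how the axial shifting operations are implemented, so the following is only a minimal NumPy sketch of one common way to realize an axial shift on a voxel feature grid. Everything in it is an assumption for illustration: the function names (`shift_along_axis`, `axial_shift`, `shifted_voxel_mlp`), the `(C, D, H, W)` layout, the channel-group scheme with an odd `shift_size`, zero filling at the borders, and the summation of the three axis-wise results before a per-voxel linear layer. It should not be read as the authors' method; the linked repository is the authoritative source.

```python
import numpy as np


def shift_along_axis(x, offset, axis):
    """Shift `x` by `offset` cells along `axis`, zero-filling vacated cells."""
    out = np.zeros_like(x)
    if offset == 0:
        out[...] = x
        return out
    src = [slice(None)] * x.ndim
    dst = [slice(None)] * x.ndim
    if offset > 0:
        src[axis] = slice(None, -offset)
        dst[axis] = slice(offset, None)
    else:
        src[axis] = slice(-offset, None)
        dst[axis] = slice(None, offset)
    out[tuple(dst)] = x[tuple(src)]
    return out


def axial_shift(voxels, axis, shift_size=3):
    """Shift channel groups of a (C, D, H, W) grid by different offsets.

    Assumption: channels are split into `shift_size` groups (odd value),
    and group g is shifted by an offset in [-shift_size//2, shift_size//2].
    """
    groups = np.array_split(voxels, shift_size, axis=0)
    offsets = range(-(shift_size // 2), shift_size // 2 + 1)
    shifted = [shift_along_axis(g, o, axis) for g, o in zip(groups, offsets)]
    return np.concatenate(shifted, axis=0)


def shifted_voxel_mlp(voxels, weight, shift_size=3):
    """Hypothetical block: shift along D, H, W, sum, then mix channels.

    `weight` has shape (C_out, C_in) and acts as a per-voxel linear layer.
    """
    mixed = sum(axial_shift(voxels, axis, shift_size) for axis in (1, 2, 3))
    return np.einsum("oc,cdhw->odhw", weight, mixed)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grid = rng.standard_normal((6, 4, 4, 4))   # 6 channels on a 4x4x4 grid
    w = rng.standard_normal((8, 6))            # maps 6 channels to 8
    print(shifted_voxel_mlp(grid, w).shape)
```

The point of the sketch is that, after shifting, a layer that only mixes channels at a single voxel location still sees features that originated in neighbouring voxels along the depth, height, and width axes, which is how a plain MLP can capture geometric context over a wider range.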

Pages: 1 / 7

Page count: 7
