PSVMLP: Point and Shifted Voxel MLP for 3D deep learning

Cited by: 2
Authors
Xie, Guanghu [1]
Liu, Yang [1]
Ji, Yiming [1]
Xie, Zongwu [1]
Cao, Baoshi [1]
Affiliations
[1] Harbin Inst Technol, State Key Lab Robot & Syst, Harbin 150001, Heilongjiang, Peoples R China
Keywords
Deep learning; Shape part segmentation; Shape classification; Point clouds; CLOUD; NETWORK;
DOI
10.1016/j.patrec.2024.05.016
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose Point and Shifted Voxel MLP (PSVMLP), a high-performance deep learning network for 3D feature extraction that operates on both point clouds and shifted voxels. PSVMLP is built mainly from simple multi-layer perceptron (MLP) structures and extracts multi-scale features from 3D data by combining point-based and voxel-based feature extraction. For voxel representation learning, we propose a wide-range geometric feature extraction method based on axial shifting operations and a simple MLP structure. The axial shifting operations shift voxels along the depth, height, and width directions, which captures more geometric information. For point cloud representation learning, we use a simple MLP structure to extract local features and a transformer structure to extract global features. Combining the two branches yields rich feature representations at different scales, which enhances the model's expressive power and generalization performance. Although it is built primarily on a simple MLP framework, the model achieves strong performance on two basic geometric feature learning tasks, shape classification and shape part segmentation. Our code is available at https://github.com/hitxraz/psvmlp.
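The abstract describes shifting voxels along the depth, height, and width directions before a simple MLP. The NumPy sketch below is one possible reading of that idea, not the authors' implementation: the function names, the channel-group scheme (each channel group shifted by a different offset), the zero-filling of vacated cells, the summation over the three axes, and the odd `shift_size` are all assumptions made for illustration. The released repository is the authoritative source.

```python
import numpy as np


def shift_along_axis(x, offset, axis):
    """Shift x by `offset` cells along `axis`, zero-filling vacated cells."""
    if offset == 0:
        return x.copy()
    out = np.zeros_like(x)
    src = [slice(None)] * x.ndim
    dst = [slice(None)] * x.ndim
    if offset > 0:
        src[axis] = slice(None, -offset)
        dst[axis] = slice(offset, None)
    else:
        src[axis] = slice(-offset, None)
        dst[axis] = slice(None, offset)
    out[tuple(dst)] = x[tuple(src)]
    return out


def axial_shift(voxels, axis, shift_size=3):
    """Hypothetical axial shift for a (C, D, H, W) voxel feature grid.

    Channels are split into `shift_size` groups (assumed odd), and group i
    is shifted by i - shift_size // 2 cells along the given spatial axis.
    """
    groups = np.array_split(voxels, shift_size, axis=0)
    offsets = range(-(shift_size // 2), shift_size // 2 + 1)
    shifted = [shift_along_axis(g, o, axis) for g, o in zip(groups, offsets)]
    return np.concatenate(shifted, axis=0)


def shifted_voxel_block(voxels, weights):
    """Sum the shifts along D, H, and W, then mix channels per voxel.

    `weights` has shape (C_out, C_in); the per-voxel linear layer followed
    by ReLU stands in for the "simple MLP structure" (an assumption).
    """
    mixed = sum(axial_shift(voxels, axis) for axis in (1, 2, 3))
    return np.maximum(np.einsum("oc,cdhw->odhw", weights, mixed), 0.0)
```

Because each channel group sees a differently displaced copy of the grid, the per-voxel channel mixing that follows can combine information from neighbouring voxels without any convolution kernel, which is the general motivation for shift-based MLP designs.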
Pages: 1-7
Number of pages: 7
Related papers
41 in total
  • [1] DGCNN: A convolutional neural network over large-scale labeled graphs
    Anh Viet Phan
    Minh Le Nguyen
    Yen Lam Hoang Nguyen
    Lam Thu Bui
    [J]. NEURAL NETWORKS, 2018, 108 : 533 - 543
  • [2] Atzmon M, 2018, arXiv, DOI arXiv:1803.10091
  • [3] Points to Patches: Enabling the Use of Self-Attention for 3D Shape Recognition
    Berg, Axel
    Oskarsson, Magnus
    O'Connor, Mark
    [J]. 2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 528 - 534
  • [4] PRA-Net: Point Relation-Aware Network for 3D Point Cloud Analysis
    Cheng, Silin
    Chen, Xiwu
    He, Xinwei
    Liu, Zhe
    Bai, Xiang
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 4436 - 4448
  • [5] PointMixer: MLP-Mixer for Point Cloud Understanding
    Choe, Jaesung
    Park, Chunghyun
    Rameau, Francois
    Park, Jaesik
    Kweon, In So
    [J]. COMPUTER VISION - ECCV 2022, PT XXVII, 2022, 13687 : 620 - 640
  • [6] Shape Completion using 3D-Encoder-Predictor CNNs and Shape Synthesis
    Dai, Angela
    Qi, Charles Ruizhongtai
    Niessner, Matthias
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 6545 - 6554
  • [7] PCT: Point cloud transformer
    Guo, Meng-Hao
    Cai, Jun-Xiong
    Liu, Zheng-Ning
    Mu, Tai-Jiang
    Martin, Ralph R.
    Hu, Shi-Min
    [J]. COMPUTATIONAL VISUAL MEDIA, 2021, 7 (02) : 187 - 199
  • [8] MVTN: Multi-View Transformation Network for 3D Shape Recognition
    Hamdi, Abdullah
    Giancola, Silvio
    Ghanem, Bernard
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 1 - 11
  • [9] DRNet: Segmentation and localization of optic disc and Fovea from diabetic retinopathy image
    Hasan, Md. Kamrul
    Alam, Md. Ashraful
    Elahi, Md. Toufick E.
    Roy, Shidhartho
    Marti, Robert
    [J]. ARTIFICIAL INTELLIGENCE IN MEDICINE, 2021, 111
  • [10] Point-to-Voxel Knowledge Distillation for LiDAR Semantic Segmentation
    Hou, Yuenan
    Zhu, Xinge
    Ma, Yuexin
    Loy, Chen Change
    Li, Yikang
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 8469 - 8478