Point Cloud Semantic Segmentation with Transformer and Multi-Scale Feature Extraction

Cited by: 1
Authors
Zhao, Wenqing [1]
Liao, Lyuchao [1]
Wang, Zhimin [1]
Cai, Sijing [1]
Liang, Yu [1]
Affiliations
[1] Fujian University of Technology, School of Transportation, Fuzhou 350108, People's Republic of China
Funding
National Natural Science Foundation of China;
Keywords
3D point cloud; semantic segmentation; Transformer architecture; multi-scale feature extraction; dilated convolution; deep learning;
DOI
10.3390/electronics14102054
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Existing point cloud semantic segmentation methods struggle to model long-range dependencies and to adapt to multi-scale targets. Current voxel-based methods rely primarily on local 3D convolutions, which limits their effectiveness in complex scenes, and their fixed receptive fields limit adaptability to targets of different scales. To address these limitations, we propose an improved Cylinder3D framework based on multi-scale feature extraction, with two major improvements. First, we introduce a Transformer module for global modeling in high-level feature extraction; its self-attention mechanism enables global information exchange between voxels, overcoming the local receptive fields of traditional convolution. Second, we design a multi-scale feature extraction architecture with tri-directional dilated convolutions, the improved DDCM module, which combines standard and dilated convolution to expand the receptive field while maintaining parameter efficiency and to better handle point clouds with heterogeneous density distributions. Evaluation on the SemanticKITTI dataset shows that the improved model achieves an mIoU of 66.8%, a 2.4% increase over the original Cylinder3D. These results indicate that the proposed enhancements significantly improve the accuracy of 3D point cloud semantic segmentation. The model's computational complexity requires further optimization, which will be addressed in future work.
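The abstract's claim that dilated convolution expands the receptive field without adding parameters can be illustrated with a small sketch. This is not the authors' DDCM module: it is a 1D NumPy toy, and the function names (`effective_kernel_size`, `dilated_conv1d`, `multi_scale_features`) and the dilation rates (1, 2, 3) are assumptions chosen for illustration. A kernel with k weights and dilation d spans k + (k - 1)(d - 1) input positions, so the same three weights can cover 3, 5, or 7 positions.

```python
import numpy as np


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Input span covered by a dilated kernel along one axis: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, weights, dilation=1):
    """Valid-mode 1D dilated cross-correlation (stride 1, no padding)."""
    signal = np.asarray(signal, dtype=float)
    weights = np.asarray(weights, dtype=float)
    span = effective_kernel_size(len(weights), dilation)
    n_out = len(signal) - span + 1
    if n_out <= 0:
        raise ValueError("signal is shorter than the dilated kernel span")
    out = np.empty(n_out)
    for i in range(n_out):
        # Pick every `dilation`-th sample inside the span: exactly len(weights) taps.
        taps = signal[i : i + span : dilation]
        out[i] = np.dot(taps, weights)
    return out


def multi_scale_features(signal, weights, dilations=(1, 2, 3)):
    """Apply the same kernel at several dilation rates and stack the results.

    Each branch is center-cropped to the shortest branch so that column j of
    every row refers to the same input position (illustrative choice only).
    """
    branches = [dilated_conv1d(signal, weights, d) for d in dilations]
    n = min(len(b) for b in branches)
    cropped = []
    for b in branches:
        offset = (len(b) - n) // 2
        cropped.append(b[offset : offset + n])
    return np.stack(cropped)


if __name__ == "__main__":
    for d in (1, 2, 3):
        print(f"dilation={d}: 3 weights span {effective_kernel_size(3, d)} inputs")
    print(multi_scale_features(np.arange(10), [1.0, 1.0, 1.0]).shape)
```

In this sketch every branch reuses the same three weights, so the parameter count is independent of the dilation rate while the covered span grows. A 3D voxel version would apply the same idea along each spatial axis.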
Pages: 17