Point cloud semantic segmentation with adaptive spatial structure graph transformer

Cited by: 6
Authors
Han, Ting [1]
Chen, Yiping [1]
Ma, Jin [1]
Liu, Xiaoxue [2]
Zhang, Wuming [1]
Zhang, Xinchang [3, 4, 5]
Wang, Huajuan [6]
Affiliations
[1] Sun Yat Sen Univ, Sch Geospatial Engn & Sci, Zhuhai 519082, Peoples R China
[2] Xiamen Univ, Fujian Key Lab Sensing & Comp Smart Cities, Xiamen 361005, Peoples R China
[3] Guangzhou Univ, Sch Geog & Remote Sensing, Guangzhou 510006, Peoples R China
[4] Xinjiang Univ, Coll Geog & Remote Sensing Sci, Urumqi 830046, Peoples R China
[5] Guangdong Urban & Rural Planning & Construct Intel, Guangzhou 511300, Peoples R China
[6] Zhuhai Surveying & Mapping Inst, Zhuhai 519000, Peoples R China
Keywords
Graph transformer; Point cloud; LiDAR; Semantic segmentation; Deep learning; NETWORKS
DOI
10.1016/j.jag.2024.104105
Chinese Library Classification (CLC) number
TP7 [Remote sensing technology]
Discipline classification codes
081102; 0816; 081602; 083002; 1404
Abstract
With the rapid development of LiDAR and artificial intelligence technologies, 3D point cloud semantic segmentation has become a prominent research topic. This technology can significantly enhance the capabilities of building information modeling, navigation, and environmental perception. However, current deep learning-based methods rely primarily on voxelization or multi-layer convolution for feature extraction. These methods often struggle to differentiate between homogeneous objects or structurally adherent targets in complex real-world scenes. To this end, we propose a Graph Transformer point cloud semantic segmentation network (ASGFormer) tailored for structurally adherent objects. First, ASGFormer combines Graph and Transformer to promote global correlation understanding in the graph. Second, a spatial index and a position embedding are constructed based on distance relationships and feature differences. Through a learnable mechanism, the structural weights between points are dynamically adjusted, achieving an adaptive spatial structure within the graph. Finally, dummy nodes are introduced to facilitate global information storage and transmission between layers, effectively addressing the issue of information loss at the terminal nodes of the graph. Comprehensive experiments are conducted on various real-world 3D point cloud datasets, analyzing the effectiveness of the proposed ASGFormer through qualitative and quantitative evaluations. ASGFormer outperforms existing approaches, reaching 91.3% OA, 78.0% mAcc, and 72.3% mIoU on the S3DIS dataset. Moreover, ASGFormer achieves 72.8%, 45.5%, 81.6%, and 70.1% mIoU on the ScanNet, City-Facade, Toronto 3D, and Semantic KITTI datasets, respectively. Notably, the proposed method demonstrates effective differentiation of homogeneous, structurally adherent objects, further contributing to the intelligent perception and modeling of complex scenes.
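The abstract reports results as OA, mAcc, and mIoU. As a reading aid, the sketch below shows how these three scores are commonly computed from a per-point confusion matrix. It assumes the standard definitions (overall accuracy, mean per-class accuracy, mean per-class intersection over union); the paper's exact evaluation protocol is not stated in this record, and the function names and toy labels are hypothetical.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count points per (ground-truth class, predicted class) pair."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm):
    """Return (OA, mAcc, mIoU) using the usual definitions (an assumption here)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    accs, ious = [], []
    for i in range(n):
        tp = cm[i][i]
        gt = sum(cm[i])                         # points whose true label is class i
        pred = sum(cm[j][i] for j in range(n))  # points predicted as class i
        union = gt + pred - tp
        if gt > 0:                              # skip classes absent from ground truth
            accs.append(tp / gt)
        if union > 0:                           # skip classes absent from both
            ious.append(tp / union)
    oa = correct / total if total else 0.0
    macc = sum(accs) / len(accs) if accs else 0.0
    miou = sum(ious) / len(ious) if ious else 0.0
    return oa, macc, miou


# Toy labels for six points and three classes (illustrative only, not paper data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
oa, macc, miou = segmentation_metrics(cm)
print(f"OA={oa:.3f} mAcc={macc:.3f} mIoU={miou:.3f}")
```

Because mIoU penalizes both missed points and false positives for every class equally, it is typically lower than OA on the same predictions, which is consistent with the gap between the 91.3% OA and 72.3% mIoU reported for S3DIS.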
Pages: 19