A Category-Contrastive Guided-Graph Convolutional Network Approach for the Semantic Segmentation of Point Clouds

Cited by: 3
Authors
Wang, Xuzhe [1 ]
Yang, Juntao [1 ,2 ]
Kang, Zhizhong [2 ,3 ,4 ]
Du, Junjian [1 ]
Tao, Zhaotong [1 ]
Qiao, Dan [1 ]
Affiliations
[1] Shandong Univ Sci & Technol, Coll Geodesy & Geomatics, Qingdao 266590, Peoples R China
[2] Minist Educ Peoples Republ China, Ctr Space Explorat, Subctr Int Cooperat & Res Lunar & Planetary Explor, Beijing 100083, Peoples R China
[3] China Univ Geosci, Sch Land Sci & Technol, Beijing 100083, Peoples R China
[4] China Univ Geosci, Lunar & Planetary Remote Sensing Explorat Res Ctr, Beijing 100083, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Semantics; Point cloud compression; Semantic segmentation; Convolutional neural networks; Three-dimensional displays; Convolution; Laser radar; Attention mechanism; contrastive learning; graph convolutional network; light detection and ranging (LiDAR); semantic segmentation; CLASSIFICATION; RECOGNITION;
DOI
10.1109/JSTARS.2023.3264240
CLC Number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Subject Classification Code
0808 ; 0809 ;
Abstract
The semantic segmentation of light detection and ranging (LiDAR) point clouds plays an important role in 3-D scene intelligent perception and semantic modeling. The unstructured, sparse, and uneven characteristics of point clouds pose great challenges to the representation of local geometric shapes, which degrades semantic segmentation performance. To address these challenges, this article proposes a category-contrastive-guided graph convolutional network (CGGC-Net) for the semantic segmentation of LiDAR point clouds. First, the detailed geometric structure of the raw point clouds is encoded to represent the inherent geometric pattern within each local neighborhood. The geometric structure information is also propagated across multiple layers, so that encodings covering different receptive fields and richer neighborhood spatial structure can be aggregated. Then, a graph convolutional network uses edge convolution layers to adaptively describe the semantic correlation between each query point and its neighboring points, while an attention mechanism gathers the surrounding feature information at the query point. The graph convolution and attention modules are iteratively stacked to aggregate and fuse spatial-context semantic information and generate highly discriminative semantic feature representations. Finally, the model parameters are learned through a multitask optimization strategy guided by a category-aware contrastive loss and a cross-entropy loss. Experiments on the public SemanticKITTI dataset and the Stanford Large-Scale 3-D Indoor Spaces dataset demonstrate the effectiveness and reliability of the proposed CGGC-Net from both quantitative and qualitative perspectives. The results indicate its capability to automatically classify LiDAR point clouds, with a mean intersection-over-union of 58.4%. Moreover, multiple comparative experiments demonstrate that the proposed method outperforms state-of-the-art methods.
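The two core ideas described in the abstract, attention-guided edge convolution and a category-contrastive multitask loss, can be illustrated with small sketches. Both are assumptions inferred from the abstract alone, written in PyTorch-style Python, and are not the authors' implementation; all names (AttentiveEdgeConv, category_contrastive_loss, multitask_loss, k, temperature, alpha) are hypothetical.

A minimal sketch of an attention-weighted edge-convolution step: each query point gathers features from its k nearest neighbors, an edge MLP describes the query-neighbor correlation, and a learned attention score weights the aggregation back to the query point.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentiveEdgeConv(nn.Module):
    """Illustrative EdgeConv-style layer with attention pooling (hypothetical)."""

    def __init__(self, in_dim: int, out_dim: int, k: int = 16):
        super().__init__()
        self.k = k
        # Edge MLP acts on [x_i, x_j - x_i] pairs (DGCNN-style edge features).
        self.edge_mlp = nn.Sequential(
            nn.Linear(2 * in_dim, out_dim), nn.ReLU(),
            nn.Linear(out_dim, out_dim),
        )
        # Attention MLP yields one score per edge.
        self.att_mlp = nn.Linear(out_dim, 1)

    def forward(self, xyz: torch.Tensor, feats: torch.Tensor) -> torch.Tensor:
        # xyz: (N, 3) coordinates, feats: (N, C) per-point features.
        dists = torch.cdist(xyz, xyz)                      # (N, N) pairwise distances
        idx = dists.topk(self.k, largest=False).indices    # (N, k) neighbor indices
        neigh = feats[idx]                                 # (N, k, C) neighbor features
        center = feats.unsqueeze(1).expand_as(neigh)       # (N, k, C) query features
        edge_in = torch.cat([center, neigh - center], -1)  # (N, k, 2C) edge descriptors
        edge_feat = self.edge_mlp(edge_in)                 # (N, k, out_dim) edge features
        att = F.softmax(self.att_mlp(edge_feat), dim=1)    # (N, k, 1) attention weights
        return (att * edge_feat).sum(dim=1)                # (N, out_dim) aggregated features
```

A minimal sketch of a multitask objective in which a category-aware contrastive term (here assumed to pull point embeddings toward their class centroid and push them away from other class centroids) is combined with the usual cross-entropy segmentation loss.

```python
import torch
import torch.nn.functional as F


def category_contrastive_loss(emb: torch.Tensor, labels: torch.Tensor,
                              temperature: float = 0.1) -> torch.Tensor:
    # emb: (N, D) point embeddings, labels: (N,) class ids.
    emb = F.normalize(emb, dim=1)
    classes = labels.unique()
    # Per-class centroids serve as category anchors (an assumption of this sketch).
    centroids = F.normalize(
        torch.stack([emb[labels == c].mean(dim=0) for c in classes]), dim=1)
    logits = emb @ centroids.t() / temperature                  # (N, |classes|) similarities
    targets = (labels.unsqueeze(1) == classes.unsqueeze(0)).float().argmax(dim=1)
    return F.cross_entropy(logits, targets)


def multitask_loss(seg_logits: torch.Tensor, emb: torch.Tensor,
                   labels: torch.Tensor, alpha: float = 0.5) -> torch.Tensor:
    # seg_logits: (N, num_classes) segmentation scores; alpha balances the two terms.
    return F.cross_entropy(seg_logits, labels) + alpha * category_contrastive_loss(emb, labels)
```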
Pages: 3715-3729
Number of pages: 15
References
64 in total
[1]   3D Semantic Parsing of Large-Scale Indoor Spaces [J].
Armeni, Iro ;
Sener, Ozan ;
Zamir, Amir R. ;
Jiang, Helen ;
Brilakis, Ioannis ;
Fischer, Martin ;
Savarese, Silvio .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :1534-1543
[2]   A Survey on 3D Object Detection Methods for Autonomous Driving Applications [J].
Arnold, Eduardo ;
Al-Jarrah, Omar Y. ;
Dianati, Mehrdad ;
Fallah, Saber ;
Oxtoby, David ;
Mouzakitis, Alex .
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2019, 20 (10) :3782-3795
[3]   SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation [J].
Badrinarayanan, Vijay ;
Kendall, Alex ;
Cipolla, Roberto .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2017, 39 (12) :2481-2495
[4]   SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences [J].
Behley, Jens ;
Garbade, Martin ;
Milioto, Andres ;
Quenzel, Jan ;
Behnke, Sven ;
Stachniss, Cyrill ;
Gall, Juergen .
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, :9296-9306
[5]   Review: Deep Learning on 3D Point Clouds [J].
Bello, Saifullahi Aminu ;
Yu, Shangshu ;
Wang, Cheng ;
Adam, Jibril Muhmmad ;
Li, Jonathan .
REMOTE SENSING, 2020, 12 (11)
[6]   Graph Neural Networks in Network Neuroscience [J].
Bessadok, Alaa ;
Mahjoub, Mohamed Ali ;
Rekik, Islem .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (05) :5833-5848
[7]   SnapNet: 3D point cloud semantic labeling with 2D deep segmentation networks [J].
Boulch, Alexandre ;
Guerry, Joris ;
Le Saux, Bertrand ;
Audebert, Nicolas .
COMPUTERS & GRAPHICS-UK, 2018, 71 :189-198
[8]   Improving Graph Neural Network Expressivity via Subgraph Isomorphism Counting [J].
Bouritsas, Giorgos ;
Frasca, Fabrizio ;
Zafeiriou, Stefanos ;
Bronstein, Michael M. .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (01) :657-668
[9]   Multi-View 3D Object Detection Network for Autonomous Driving [J].
Chen, Xiaozhi ;
Ma, Huimin ;
Wan, Ji ;
Li, Bo ;
Xia, Tian .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :6526-6534
[10]   4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks [J].
Choy, Christopher ;
Gwak, JunYoung ;
Savarese, Silvio .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :3070-3079