RGAM: A novel network architecture for 3D point cloud semantic segmentation in indoor scenes

Cited by: 28
Authors
Chen, Xue-Tao [1, 2]
Li, Ying [1, 2]
Fan, Jia-Hao [1, 2]
Wang, Rui [3]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Peoples R China
[2] Jilin Univ, Minist Educ, Key Lab Symbol Computat & Knowledge Engn, Changchun 130012, Peoples R China
[3] Space Technol Jilin Ltd Co, Jilin 132013, Jilin, Peoples R China
Keywords
3D point cloud; Semantic segmentation; Deep neural network; Attention mechanism; Classification
DOI
10.1016/j.ins.2021.04.069
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Three-dimensional (3D) point cloud semantic segmentation is an essential part of computer vision for scene comprehension. Nevertheless, due to their loss of detail, existing networks lack the ability to recognize complex scenes. This paper proposes a novel network architecture, called the ring grouping neural network with attention module (RGAM), which presents four improvements over the existing networks. First, novel multi-scale ring grouping learning is designed to extract the multi-scale neighborhood features without overlapped sampling, allowing the network to adapt to objects of different scales. Second, neighborhood information fusion is defined as the weighted sum of multiple neighborhood features, enabling the representation of each point to be considered in different neighborhoods. Third, in the global view, a spatial attention module is introduced among the neighborhoods, allowing long-range contextual information to be exploited for 3D point cloud semantic segmentation. Finally, a channel attention module is appended to the RGAM: the correlation of each channel with key information enhances the complex scene recognition ability of the RGAM. Experimental results on the challenging S3DIS, ScanNet, and NYU-V2 datasets demonstrate that the RGAM has stronger recognition ability than the existing networks based on several state-of-the-art algorithms for 3D point cloud semantic segmentation. (c) 2021 Elsevier Inc. All rights reserved.
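The first two improvements named in the abstract (multi-scale ring grouping without overlapped sampling, and neighborhood fusion as a weighted sum) can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no formulas, so the distance-based rings, the channel-wise max pooling, the softmax weighting, and every function name below (`ring_group`, `pool_ring`, `fuse`) are assumptions made for illustration only. The spatial and channel attention modules are not covered.

```python
import math

def ring_group(center, points, radii):
    """Assign each point index to the ring (annulus) its distance to `center` falls in.

    `radii` is an ascending list of outer radii (assumed values). Ring 0 covers
    [0, radii[0]) and ring i covers [radii[i-1], radii[i]), so no point is
    placed in two rings. Points beyond the last radius are dropped.
    """
    rings = [[] for _ in radii]
    for idx, p in enumerate(points):
        d = math.dist(center, p)
        for i, r in enumerate(radii):
            if d < r:
                rings[i].append(idx)
                break
    return rings

def pool_ring(features, indices):
    """Channel-wise max over one ring's point features (zeros for an empty ring)."""
    dim = len(features[0])
    if not indices:
        return [0.0] * dim
    return [max(features[j][c] for j in indices) for c in range(dim)]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(ring_features, scores):
    """Weighted sum of per-ring features; weights are softmax(scores).

    In a trained network the scores would be learned; here they are inputs.
    """
    weights = softmax(scores)
    dim = len(ring_features[0])
    return [sum(w * f[c] for w, f in zip(weights, ring_features)) for c in range(dim)]

# Toy data: five 3D points with 2-channel features, grouped around the origin.
points = [(0.1, 0.0, 0.0), (0.0, 0.3, 0.0), (0.0, 0.0, 0.5),
          (0.9, 0.0, 0.0), (2.0, 0.0, 0.0)]
features = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0], [0.5, 0.5], [9.0, 9.0]]

rings = ring_group((0.0, 0.0, 0.0), points, radii=[0.2, 0.6, 1.0])
ring_features = [pool_ring(features, r) for r in rings]
fused = fuse(ring_features, scores=[0.0, 0.0, 0.0])  # equal scores: plain average

print(rings)          # [[0], [1, 2], [3]]
print(ring_features)  # [[1.0, 0.0], [3.0, 2.0], [0.5, 0.5]]
```

Because each point lands in at most one ring, the three scales share no samples, which is one plausible reading of "without overlapped sampling"; the paper itself should be consulted for the actual grouping and weighting scheme.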
Pages: 87-103
Number of pages: 17