RGAM: A novel network architecture for 3D point cloud semantic segmentation in indoor scenes

Cited by: 29
Authors
Chen, Xue-Tao [1, 2]
Li, Ying [1, 2]
Fan, Jia-Hao [1, 2]
Wang, Rui [3]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Peoples R China
[2] Jilin Univ, Minist Educ, Key Lab Symbol Computat & Knowledge Engn, Changchun 130012, Peoples R China
[3] Space Technol Jilin Ltd Co, Jilin 132013, Jilin, Peoples R China
Keywords
3D point cloud; Semantic segmentation; Deep neural network; Attention mechanism; Classification
DOI
10.1016/j.ins.2021.04.069
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Three-dimensional (3D) point cloud semantic segmentation is an essential part of computer vision for scene comprehension. Nevertheless, due to their loss of detail, existing networks lack the ability to recognize complex scenes. This paper proposes a novel network architecture, called the ring grouping neural network with attention module (RGAM), which presents four improvements over the existing networks. First, novel multi-scale ring grouping learning is designed to extract the multi-scale neighborhood features without overlapped sampling, allowing the network to adapt to objects of different scales. Second, neighborhood information fusion is defined as the weighted sum of multiple neighborhood features, enabling the representation of each point to be considered in different neighborhoods. Third, in the global view, a spatial attention module is introduced among the neighborhoods, allowing long-range contextual information to be exploited for 3D point cloud semantic segmentation. Finally, a channel attention module is appended to the RGAM: the correlation of each channel with key information enhances the complex scene recognition ability of the RGAM. Experimental results on the challenging S3DIS, ScanNet, and NYU-V2 datasets demonstrate that the RGAM has stronger recognition ability than the existing networks based on several state-of-the-art algorithms for 3D point cloud semantic segmentation. (c) 2021 Elsevier Inc. All rights reserved.
Pages: 87-103
Number of pages: 17
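
The abstract describes multi-scale ring grouping "without overlapped sampling" and a neighborhood fusion defined as a weighted sum of multiple neighborhood features. The sketch below is one possible reading of those two ideas, not the authors' implementation. It assumes that rings are concentric distance bands around a center point, that each ring is summarized by channel-wise max pooling, and that the fusion weights are fixed numbers (in a trained network they would presumably be learned). The names `ring_groups`, `max_pool`, and `fuse` are hypothetical.

```python
import math


def ring_groups(center, points, radii):
    """Assign each point index to exactly one concentric ring around `center`.

    Ring 0 covers distances 0 <= d <= radii[0]; ring k covers
    radii[k-1] < d <= radii[k]. Points beyond the last radius are dropped.
    Because a point joins only the first ring that covers it, rings never overlap.
    `radii` is assumed to be sorted in increasing order.
    """
    groups = [[] for _ in radii]
    for idx, p in enumerate(points):
        d = math.dist(center, p)
        for k, outer in enumerate(radii):
            if d <= outer:
                groups[k].append(idx)
                break
    return groups


def max_pool(features, indices, dim):
    """Channel-wise max over the selected rows (assumed pooling; zeros if the ring is empty)."""
    if not indices:
        return [0.0] * dim
    return [max(features[i][c] for i in indices) for c in range(dim)]


def fuse(ring_features, weights):
    """Weighted sum of per-ring feature vectors (weights are illustrative constants)."""
    dim = len(ring_features[0])
    return [sum(w * f[c] for w, f in zip(weights, ring_features)) for c in range(dim)]


# Toy data: five points with two feature channels each.
center = (0.0, 0.0, 0.0)
points = [(0.1, 0, 0), (0.3, 0, 0), (0, 0.5, 0), (0, 0, 0.9), (2, 0, 0)]
features = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0], [0.5, 4.0], [9.0, 9.0]]
radii = [0.2, 0.6, 1.0]

groups = ring_groups(center, points, radii)
pooled = [max_pool(features, g, dim=2) for g in groups]
fused = fuse(pooled, weights=[0.5, 0.25, 0.25])

print(groups)  # [[0], [1, 2], [3]]
print(fused)   # [1.375, 1.5]
```

In this toy run the point at distance 2 lies outside every ring and is ignored, and each remaining point contributes to exactly one scale, which is the non-overlapping property the abstract emphasizes.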