RGAM: A novel network architecture for 3D point cloud semantic segmentation in indoor scenes

Cited by: 29

Authors:

Chen, Xue-Tao [1,2]

Li, Ying [1,2]

Fan, Jia-Hao [1,2]

Wang, Rui [3]

Affiliations:

[1] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Peoples R China

[2] Jilin Univ, Minist Educ, Key Lab Symbol Computat & Knowledge Engn, Changchun 130012, Peoples R China

[3] Space Technol Jilin Ltd Co, Jilin 132013, Jilin, Peoples R China

Source:

INFORMATION SCIENCES | 2021 / Vol. 571

Keywords:

3D Point cloud; Semantic segmentation; Deep neural network; Attention mechanism; CLASSIFICATION;

DOI:

10.1016/j.ins.2021.04.069

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Three-dimensional (3D) point cloud semantic segmentation is an essential part of computer vision for scene comprehension. Nevertheless, due to their loss of detail, existing networks lack the ability to recognize complex scenes. This paper proposes a novel network architecture, called the ring grouping neural network with attention module (RGAM), which presents four improvements over the existing networks. First, novel multi-scale ring grouping learning is designed to extract the multi-scale neighborhood features without overlapped sampling, allowing the network to adapt to objects of different scales. Second, neighborhood information fusion is defined as the weighted sum of multiple neighborhood features, enabling the representation of each point to be considered in different neighborhoods. Third, in the global view, a spatial attention module is introduced among the neighborhoods, allowing long-range contextual information to be exploited for 3D point cloud semantic segmentation. Finally, a channel attention module is appended to the RGAM: the correlation of each channel with key information enhances the complex scene recognition ability of the RGAM. Experimental results on the challenging S3DIS, ScanNet, and NYU-V2 datasets demonstrate that the RGAM has stronger recognition ability than the existing networks based on several state-of-the-art algorithms for 3D point cloud semantic segmentation. (c) 2021 Elsevier Inc. All rights reserved.
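The abstract's first two ideas, ring grouping without overlapped sampling and fusion as a weighted sum of neighborhood features, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `ring_group` and `fuse`, the radii, and the fixed fusion weights are all assumptions made for illustration, and a real network would presumably learn the weights and extract per-ring features with shared layers.

```python
import math

def ring_group(center, points, radii):
    """Assign each point index to the ring whose distance range contains it.

    `radii` is an increasing list of outer radii (hypothetical values).
    Ring i covers distances in [radii[i-1], radii[i]), with an inner radius
    of 0 for the first ring, so no point falls into two rings.
    Points farther than the last radius are left out.
    """
    rings = [[] for _ in radii]
    for idx, p in enumerate(points):
        d = math.dist(center, p)
        inner = 0.0
        for ring_id, outer in enumerate(radii):
            if inner <= d < outer:
                rings[ring_id].append(idx)
                break
            inner = outer
    return rings

def fuse(ring_features, weights):
    """Weighted sum of per-ring feature vectors.

    `weights` are assumed to be given; they stand in for whatever
    weighting the network would compute.
    """
    dim = len(ring_features[0])
    return [
        sum(w * f[c] for w, f in zip(weights, ring_features))
        for c in range(dim)
    ]

if __name__ == "__main__":
    pts = [(0.05, 0, 0), (0.15, 0, 0), (0.25, 0, 0), (0.5, 0, 0)]
    print(ring_group((0, 0, 0), pts, [0.1, 0.2, 0.3]))
    print(fuse([[1.0, 2.0], [3.0, 4.0]], [0.5, 0.5]))
```

Because each ring uses a half-open distance interval, a neighbor is sampled at exactly one scale, which is one plausible reading of "without overlapped sampling" in contrast to nested ball queries of increasing radius.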


Pages: 87-103

Page count: 17
