Omni-supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning

Cited by: 38

Authors:

Gong, Jingyu [1]

Xu, Jiachen [1]

Tan, Xin [1]

Song, Haichuan [2]

Qu, Yanyun [3]

Xie, Yuan [2]

Ma, Lizhuang [1,2]

Affiliations:

[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai, Peoples R China

[2] East China Normal Univ, Sch Comp Sci & Technol, Shanghai, Peoples R China

[3] Xiamen Univ, Sch Informat, Xiamen, Fujian, Peoples R China

Source:

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021

Funding:

Natural Science Foundation of Shanghai; National Natural Science Foundation of China

Keywords:

DOI:

10.1109/CVPR46437.2021.01150

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Hidden features in neural networks usually fail to learn informative representations for 3D segmentation because supervision is given only on the output prediction; this can be addressed by omni-scale supervision on intermediate layers. In this paper, we bring the first omni-scale supervision method to point cloud segmentation via the proposed gradual Receptive Field Component Reasoning (RFCR), where target Receptive Field Component Codes (RFCCs) are designed to record the categories within the receptive fields of hidden units in the encoder. The target RFCCs then supervise the decoder to gradually infer the RFCCs in a coarse-to-fine category reasoning manner and finally obtain the semantic labels. Because many hidden features are inactive, with tiny magnitudes, and make minor contributions to RFCC prediction, we propose Feature Densification with a centrifugal potential to obtain more unambiguous features, which is in effect equivalent to entropy regularization over features. More active features can further unleash the potential of our omni-supervision method. We embed our method into four prevailing backbones and test it on three challenging benchmarks. Our method significantly improves the backbones on all three datasets. Specifically, it achieves new state-of-the-art performance on S3DIS and Semantic3D and ranks 1st on the ScanNet benchmark among all point-based methods. Code is publicly available at https://github.com/azuki-miho/RFCR.
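The abstract describes target RFCCs as recording the categories present within each hidden unit's receptive field. The following is a minimal sketch of one way such targets could be built, not the authors' implementation. It assumes (this record does not say so) that a code is a multi-hot vector over classes and that each coarser unit's receptive field is available as a list of indices into the finer level. The names `one_hot_codes` and `pool_codes` and the toy receptive fields are hypothetical.

```python
import numpy as np


def one_hot_codes(labels, num_classes):
    """Codes for the raw points: each point contains exactly its own class."""
    codes = np.zeros((len(labels), num_classes), dtype=bool)
    codes[np.arange(len(labels)), labels] = True
    return codes


def pool_codes(codes, receptive_fields):
    """Code of each coarser unit = logical OR of the finer codes it covers."""
    return np.stack([codes[idx].any(axis=0) for idx in receptive_fields])


# Toy scene (assumed data): 6 points, 3 classes, two downsampling stages.
labels = np.array([0, 0, 1, 1, 2, 2])
level0 = one_hot_codes(labels, num_classes=3)

# Hypothetical receptive fields: indices of finer-level units under each unit.
fields_1 = [np.array([0, 1]), np.array([1, 2, 3]), np.array([4, 5])]
fields_2 = [np.array([0, 1]), np.array([1, 2])]

level1 = pool_codes(level0, fields_1)  # codes for the first coarse level
level2 = pool_codes(level1, fields_2)  # codes for the coarsest level
```

Under these assumptions the codes grow denser toward the coarse levels: a coarse unit marks every class that appears anywhere beneath it, and a decoder supervised with such targets would narrow the candidate set level by level until a single label per point remains.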

Pages: 11668-11677

Page count: 10

References (44 in total):

[1] [Anonymous], 2016, PROC CVPR IEEE, DOI 10.1109/CVPR.2016.170

[2] [Anonymous], 2020, AAAI

[3] Bengio Y., 2005, Advances in Neural Information Processing Systems (NeurIPS)

[4] Caran KL, 2013, SOFT FIBRILLAR MATERIALS: FABRICATION AND APPLICATIONS, P3

[5] Dai, Angela; Niessner, Matthias. 3DMV: Joint 3D-Multi-view Prediction for 3D Semantic Scene Segmentation. COMPUTER VISION - ECCV 2018, PT X, 2018, 11214: 458-474

[6] Dai, Angela; Qi, Charles Ruizhongtai; Niessner, Matthias. Shape Completion using 3D-Encoder-Predictor CNNs and Shape Synthesis. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 6545-6554

[7] Dai, Angela; Chang, Angel X.; Savva, Manolis; Halber, Maciej; Funkhouser, Thomas; Niessner, Matthias. ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 2432-2443

[8] Zhang, Feihu; Fang, Jin; Wah, Benjamin; Torr, Philip. Deep FusionNet for Point Cloud Semantic Segmentation. COMPUTER VISION - ECCV 2020, PT XXIV, 2020, 12369: 644-663

[9] Gong Jingyu, 2021, AAAI

[10] Graham, Benjamin; Engelcke, Martin; van der Maaten, Laurens. 3D Semantic Segmentation with Submanifold Sparse Convolutional Networks. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 9224-9232
