3D Guided Weakly Supervised Semantic Segmentation

Cited by: 0
Authors
Sun, Weixuan [1, 2]
Zhang, Jing [1, 2]
Barnes, Nick [1 ]
Affiliations
[1] Australian Natl Univ, Canberra, ACT, Australia
[2] CSIRO, Data61, Canberra, ACT, Australia
Source
COMPUTER VISION - ACCV 2020, PT I | 2021 / Vol. 12622
Keywords
Semantic segmentation; Weak supervision; 3D guidance;
DOI
10.1007/978-3-030-69525-5_35
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Fully supervised semantic segmentation requires clean pixel-wise annotation, which is laborious and expensive to obtain. In this paper, we propose a weakly supervised 2D semantic segmentation model that combines sparse bounding box labels with available 3D information, which is much easier to obtain with advanced sensors. We introduce a 2D-3D inference module to generate accurate pixel-wise segment proposal masks. Guided by 3D information, we first generate a point cloud of objects and calculate a per-class objectness probability score for each point using projected bounding boxes. We then project the point cloud with objectness probabilities back to the 2D images and apply a refinement step to obtain segment proposals, which are treated as pseudo labels to train a semantic segmentation network. Our method works recursively to gradually refine these segment proposals. We conducted extensive experiments on the 2D-3D-S dataset, where we manually labeled a subset of images with bounding boxes. We show that the proposed method can generate accurate segment proposals when bounding box labels are available on only a small subset of training images. Performance comparison with recent state-of-the-art methods further illustrates the effectiveness of our method.
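The 2D-3D inference step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a pinhole camera model, known camera poses, and a simple voting rule in which a point's objectness score is the fraction of views whose bounding box contains its projection. The function names (`backproject`, `project`, `objectness`) and the toy intrinsics are hypothetical, and the refinement step and network training are omitted.

```python
import numpy as np

def backproject(depth, K):
    """Lift every pixel of a depth map to a 3D point (pinhole model, assumed)."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]
    z = depth.astype(float)
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

def project(points, K):
    """Project camera-frame 3D points to pixel coordinates (u, v)."""
    z = points[:, 2]
    u = K[0, 0] * points[:, 0] / z + K[0, 2]
    v = K[1, 1] * points[:, 1] / z + K[1, 2]
    return np.stack([u, v], axis=-1)

def objectness(points, views):
    """Hypothetical voting score: fraction of views whose box contains the point.

    views: list of (R, t, K, box) with box = (u_min, v_min, u_max, v_max),
    where R, t map world coordinates into that view's camera frame.
    """
    votes = np.zeros(len(points))
    for R, t, K, (u0, v0, u1, v1) in views:
        cam = points @ R.T + t              # world frame -> camera frame
        front = cam[:, 2] > 1e-6            # ignore points behind the camera
        uv = np.full((len(points), 2), -1.0)
        uv[front] = project(cam[front], K)
        inside = (front
                  & (uv[:, 0] >= u0) & (uv[:, 0] <= u1)
                  & (uv[:, 1] >= v0) & (uv[:, 1] <= v1))
        votes += inside
    return votes / len(views)

# Toy example: one 4x4 depth map and two boxes for the same class.
K = np.array([[2.0, 0.0, 2.0], [0.0, 2.0, 2.0], [0.0, 0.0, 1.0]])
depth = np.ones((4, 4))
R, t = np.eye(3), np.zeros(3)
views = [(R, t, K, (1, 1, 2, 2)), (R, t, K, (2, 2, 3, 3))]

points = backproject(depth, K)
# The points come from the reference view, so their scores map back to its pixels.
scores = objectness(points, views).reshape(depth.shape)
proposal = scores >= 0.5                    # threshold chosen for illustration only
```

In this sketch the point cloud is built from a single reference view, so projecting the scores back to 2D reduces to reshaping them onto the pixel grid; with a fused multi-view point cloud, `project` would be used again to splat scores into each image.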
Pages: 585-602
Page count: 18