Incorporating Depth Information into Few-Shot Semantic Segmentation

Cited by: 8
Authors
Zhang, Yifei [1]
Sidibe, Desire [2]
Morel, Olivier [1]
Meriaudeau, Fabrice [1]
Affiliations
[1] Univ Bourgogne Franche-Comté, ERL VIBOT CNRS 6000, ImViA, F-71200 Le Creusot, France
[2] Univ Paris Saclay, IBISC, Univ Evry, F-91020 Evry, France
Source
2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR) | 2021
DOI
10.1109/ICPR48806.2021.9412921
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Few-shot segmentation presents a significant challenge for semantic scene understanding under limited supervision. Specifically, the task aims to generalize a model's segmentation ability to new categories given only a few samples. To obtain more complete scene information, we extend RGB-centric methods to take advantage of complementary depth information. In this paper, we propose a two-stream deep neural network based on metric learning. Our method, known as RDNet, learns class-specific prototype representations in separate RGB and depth embedding spaces. The learned prototypes provide effective semantic guidance for the corresponding RGB and depth query images, leading to more accurate segmentation. Moreover, we build a novel outdoor scene dataset, known as Cityscapes-3^i, using labeled RGB images and depth images from the Cityscapes dataset. We also perform ablation studies to explore the effective use of depth information in few-shot segmentation tasks. Experiments on Cityscapes-3^i show that our method achieves excellent results using visual and complementary geometric cues from only a few labeled examples.
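The prototype idea in the abstract can be illustrated with a minimal sketch. It is not the authors' RDNet implementation: the feature maps are hand-written toy values standing in for learned embeddings, the function names are hypothetical, and summing the RGB and depth similarity scores is an assumed fusion rule, since the abstract does not say how the two streams are combined. Each class prototype is the average of the support features under that class's mask, and each query pixel takes the class whose prototypes are most similar by cosine similarity.

```python
import math

def masked_average_pooling(features, mask):
    """Average the feature vectors at positions where mask is 1 (a class prototype)."""
    channels = len(features[0][0])
    total = [0.0] * channels
    count = 0
    for feat_row, mask_row in zip(features, mask):
        for vec, m in zip(feat_row, mask_row):
            if m:
                count += 1
                for c in range(channels):
                    total[c] += vec[c]
    if count == 0:
        raise ValueError("mask selects no pixels")
    return [t / count for t in total]

def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0

def similarity_maps(query, prototypes):
    """Per-pixel cosine similarity of a query feature map to each prototype."""
    return [[[cosine(vec, p) for p in prototypes] for vec in row] for row in query]

def segment_two_stream(rgb_query, depth_query, rgb_protos, depth_protos):
    """Label each pixel by the class with the highest summed RGB + depth score.

    Summing the two streams' scores is an assumption made for illustration.
    """
    rgb_scores = similarity_maps(rgb_query, rgb_protos)
    depth_scores = similarity_maps(depth_query, depth_protos)
    labels = []
    for r_row, d_row in zip(rgb_scores, depth_scores):
        label_row = []
        for r, d in zip(r_row, d_row):
            fused = [x + y for x, y in zip(r, d)]
            label_row.append(max(range(len(fused)), key=fused.__getitem__))
        labels.append(label_row)
    return labels

# Toy 2x2 support feature maps with 2 channels (stand-ins for learned embeddings).
rgb_support = [[[1.0, 0.0], [1.0, 0.0]], [[0.0, 1.0], [0.0, 1.0]]]
depth_support = [[[0.9, 0.1], [1.0, 0.0]], [[0.0, 1.0], [0.1, 0.9]]]
fg_mask = [[1, 1], [0, 0]]  # top row is the target class
bg_mask = [[0, 0], [1, 1]]  # bottom row is background

# Prototype order defines the labels: index 0 = background, index 1 = foreground.
rgb_protos = [masked_average_pooling(rgb_support, m) for m in (bg_mask, fg_mask)]
depth_protos = [masked_average_pooling(depth_support, m) for m in (bg_mask, fg_mask)]

# Toy 1x2 query feature maps for the two streams.
rgb_query = [[[0.0, 1.0], [1.0, 0.0]]]
depth_query = [[[0.1, 0.9], [0.8, 0.2]]]

labels = segment_two_stream(rgb_query, depth_query, rgb_protos, depth_protos)
print(labels)  # [[0, 1]]
```

In this toy case the first query pixel resembles the background prototypes in both streams and the second resembles the foreground prototypes, so the predicted label map is background followed by foreground.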
Pages: 3582-3588
Page count: 7