Query semantic reconstruction for background in few-shot segmentation

Cited by: 4
Authors
Guan, Haoyan [1]
Spratling, Michael [1]
Affiliations
[1] King's College London, Department of Informatics, London WC2B 4BG, England
Keywords
Few-shot learning; Semantic segmentation; Metric learning
DOI
10.1007/s00371-023-02817-x
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Few-shot segmentation (FSS) aims to segment unseen classes using a few annotated samples. Typically, a prototype representing the foreground class is extracted from the annotated support image(s) and matched to the features representing each pixel in the query image. However, models learnt in this way are insufficiently discriminative and often produce false positives, misclassifying background pixels as foreground. Some FSS methods try to address this issue by using the background in the support image(s) to help identify the background in the query image. However, the backgrounds of these images are often quite distinct, so the support image background is uninformative. This article proposes a method, QSR, that extracts the background from the query image itself and, as a result, is better able to discriminate between foreground and background features in the query image. This is achieved by modifying the training process to associate prototypes with class labels, including known classes from the training data and latent classes representing unknown background objects. This class information is then used to extract a background prototype from the query image. To successfully associate prototypes with class labels and extract a background prototype capable of predicting a mask for the background regions of the image, the machinery for extracting and using foreground prototypes is induced to become more discriminative between different classes. In experiments, QSR achieves state-of-the-art results for both 1-shot and 5-shot FSS on the PASCAL-5^i and COCO-20^i datasets. As QSR operates only during training, results are produced with no extra computational complexity during testing.
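The prototype-matching step that the abstract calls typical can be illustrated with a minimal NumPy sketch. It assumes masked average pooling for prototype extraction and cosine similarity for matching, which are common choices in prototype-based FSS but are not confirmed by this record as the authors' exact design. The function names are hypothetical, and the sketch does not implement QSR's training procedure; it only shows how a foreground and a background prototype, however obtained, can label query pixels.

```python
import numpy as np


def masked_average_pooling(features, mask):
    """Average the feature vectors of the pixels selected by a binary mask.

    features: array of shape (C, H, W); mask: array of shape (H, W).
    Returns a prototype of shape (C,). (Assumed pooling scheme.)
    """
    mask = mask.astype(features.dtype)
    total = (features * mask[None]).sum(axis=(1, 2))
    return total / max(mask.sum(), 1e-8)  # guard against an empty mask


def cosine_similarity_map(features, prototype, eps=1e-8):
    """Cosine similarity between every pixel feature and one prototype.

    features: (C, H, W); prototype: (C,). Returns a map of shape (H, W).
    """
    dot = np.einsum("chw,c->hw", features, prototype)
    norms = np.linalg.norm(features, axis=0) * np.linalg.norm(prototype)
    return dot / (norms + eps)


def predict_mask(query_features, fg_prototype, bg_prototype):
    """Label a query pixel as foreground when it is closer to the
    foreground prototype than to the background prototype."""
    fg = cosine_similarity_map(query_features, fg_prototype)
    bg = cosine_similarity_map(query_features, bg_prototype)
    return (fg > bg).astype(np.uint8)


if __name__ == "__main__":
    # Toy data: channel 0 fires on foreground, channel 1 on background.
    support_mask = np.array([[1, 0], [0, 1]])
    support_feats = np.stack([support_mask, 1 - support_mask]).astype(float)
    query_layout = np.array([[0, 1], [1, 1]])
    query_feats = np.stack([query_layout, 1 - query_layout]).astype(float)

    fg_proto = masked_average_pooling(support_feats, support_mask)
    bg_proto = masked_average_pooling(support_feats, 1 - support_mask)
    print(predict_mask(query_feats, fg_proto, bg_proto))
```

In this toy case the background prototype is pooled from the support image purely for illustration. The abstract's point is that such a prototype is often uninformative for the query, which is why QSR instead learns to derive the background prototype from the query image during training.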
Pages: 799-810
Page count: 12