SG-One: Similarity Guidance Network for One-Shot Semantic Segmentation

Cited by: 340
Authors
Zhang, Xiaolin [1]
Wei, Yunchao [1]
Yang, Yi [1]
Huang, Thomas S. [2, 3]
Affiliations
[1] Univ Technol Sydney, Ctr Artificial Intelligence, ReLER Lab, Sydney, NSW 2007, Australia
[2] Univ Illinois, Dept Elect & Comp Engn, Urbana, IL 61901 USA
[3] Univ Illinois, Beckman Inst, Urbana, IL 61901 USA
Funding
Australian Research Council
Keywords
Image segmentation; Feature extraction; Testing; Semantics; Training; Task analysis; Dogs; Few-shot learning; image segmentation; neural networks; siamese network;
DOI
10.1109/TCYB.2020.2992433
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
One-shot image semantic segmentation poses the challenging task of recognizing object regions from unseen categories with only one annotated example as supervision. In this article, we propose a simple yet effective similarity guidance network (SG-One) to tackle the one-shot segmentation problem. We aim to predict the segmentation mask of a query image with reference to one densely labeled support image of the same category. To obtain a robust representative feature of the support image, we first adopt a masked average pooling strategy that produces the guidance features by taking into account only the pixels belonging to the labeled object region of the support image. We then use cosine similarity to relate the guidance features to the features of each pixel in the query image. In this way, the possibilities embedded in the resulting similarity maps can be used to guide the segmentation of objects. Furthermore, SG-One is a unified framework that efficiently processes both support and query images within one network and can be learned end to end. We conduct extensive experiments on Pascal VOC 2012. In particular, SG-One achieves an mIoU score of 46.3%, surpassing the baseline methods.
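The two operations named in the abstract, masked average pooling and a per-pixel cosine similarity map, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, array layouts (channels first), and the small stabilizing constants are assumptions made for illustration only.

```python
import numpy as np

def masked_average_pooling(features, mask):
    """Average the feature vectors of the pixels selected by a binary mask.

    features: array of shape (C, H, W), hypothetical support feature map.
    mask:     array of shape (H, W), 1 for object pixels, 0 elsewhere.
    Returns a guidance vector of shape (C,).
    """
    mask = mask.astype(features.dtype)
    # Sum features over masked pixels only, then divide by the pixel count.
    total = (features * mask[None]).sum(axis=(1, 2))
    return total / (mask.sum() + 1e-5)  # small constant avoids division by zero

def cosine_similarity_map(query_features, guidance, eps=1e-8):
    """Cosine similarity between a guidance vector and every query pixel.

    query_features: array of shape (C, H, W).
    guidance:       array of shape (C,).
    Returns a similarity map of shape (H, W) with values in [-1, 1].
    """
    # Dot product of the guidance vector with each pixel's feature vector.
    dot = np.tensordot(guidance, query_features, axes=([0], [0]))
    norms = np.linalg.norm(query_features, axis=0) * np.linalg.norm(guidance)
    return dot / (norms + eps)
```

In this sketch, the similarity map would be computed once per query image and could then be used to reweight the query features before a segmentation head; how the map is consumed downstream is not specified in the abstract and is left out here.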
Pages: 3855-3865
Number of pages: 11