Base and Meta: A New Perspective on Few-Shot Segmentation

Cited by: 113
Authors
Lang, Chunbo [1 ]
Cheng, Gong [1 ]
Tu, Binfei [1 ]
Li, Chao [2 ]
Han, Junwei [1 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Automat, Xian 710060, Peoples R China
[2] Zhejiang Lab, Hangzhou 310058, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Few-shot learning; few-shot segmentation; semantic segmentation; 3D point cloud segmentation;
DOI
10.1109/TPAMI.2023.3265865
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Despite the progress made by few-shot segmentation (FSS) in low-data regimes, the generalization capability of most previous works can be fragile when encountering hard query samples that contain seen-class objects. This paper proposes a fresh and powerful scheme to tackle such an intractable bias problem, dubbed base and meta (BAM). Concretely, we attach an auxiliary branch (base learner) to the conventional FSS framework (meta learner) to explicitly identify base-class objects, i.e., the regions that do not need to be segmented. The coarse results produced by these two learners in parallel are then adaptively integrated to derive accurate segmentation predictions. Considering the sensitivity of the meta learner, we further introduce adjustment factors that estimate the scene differences between support and query image pairs from both style and appearance perspectives, so as to facilitate the model ensemble forecasting. The remarkable performance gains on standard benchmarks (PASCAL-5^i, COCO-20^i, and FSS-1000) manifest the effectiveness of our approach, and surprisingly, our versatile scheme sets a new state of the art even with two plain learners. Furthermore, in light of its unique nature, we also discuss several more practical but challenging extensions, including generalized FSS, 3D point cloud FSS, class-agnostic FSS, cross-domain FSS, weak-label FSS, and zero-shot segmentation. Our source code is available at https://github.com/chunbolang/BAM.
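The core idea of the abstract, suppressing the meta learner's foreground prediction wherever the base learner confidently detects base-class (seen) regions, can be illustrated with a minimal sketch. The function name, the multiplicative suppression rule, and the scalar `factor` below are hypothetical simplifications for illustration; the paper's actual ensemble module is learned and more elaborate.

```python
import numpy as np

def ensemble_predictions(meta_fg, base_bg, factor=1.0):
    """Hypothetical BAM-style ensemble sketch.

    meta_fg: (H, W) foreground probability from the meta learner
    base_bg: (H, W) probability that a pixel belongs to a base
             (seen) class, from the auxiliary base learner
    factor:  scalar adjustment, standing in for the paper's
             style/appearance-based scene-difference estimate

    Pixels the base learner marks as base-class objects are
    regions that do not need to be segmented, so the meta
    learner's foreground score is suppressed there.
    """
    suppressed = meta_fg * (1.0 - factor * base_bg)
    return np.clip(suppressed, 0.0, 1.0)

# Toy 2x2 example: the top-right pixel is a confident
# base-class region, so its foreground score collapses.
meta_fg = np.array([[0.9, 0.8],
                    [0.2, 0.7]])
base_bg = np.array([[0.05, 0.9],
                    [0.1, 0.95]])
out = ensemble_predictions(meta_fg, base_bg, factor=1.0)
```

The design choice worth noting is that the base learner never predicts the novel class itself; it only vetoes regions, which is why the abstract frames it as identifying "the regions that do not need to be segmented."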
Pages: 10669-10686 (18 pages)