Learning prototypes from background and latent objects for few-shot semantic segmentation

Cited by: 0
Authors
Wang, Yicong [1 ]
Huang, Rong [1 ,3 ]
Zhou, Shubo [1 ,3 ]
Jiang, Xueqin [1 ,3 ]
Fang, Zhijun [2 ]
Affiliations
[1] Donghua Univ, Coll Informat Sci & Technol, Shanghai 201620, Peoples R China
[2] Donghua Univ, Sch Comp Sci & Technol, Shanghai 201620, Peoples R China
[3] Donghua Univ, Engn Res Ctr Digitized Text & Apparel Technol, Minist Educ, Shanghai 201620, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Semantic segmentation; Few-shot semantic segmentation; Prototype learning; Self-attention mechanism; Network;
DOI
10.1016/j.knosys.2025.113218
CLC number
TP18 [Theory of artificial intelligence];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Few-shot semantic segmentation (FSS) aims to segment the target object within a given image, supported by only a few samples with pixel-level annotations. The existing FSS framework primarily focuses on the target area to learn a target-object prototype while neglecting non-target clues. As such, the target-object prototype must not only segment the target object but also filter out the non-target area, resulting in numerous false positives. In this paper, we propose a background and latent-object prototype learning network (BLPLNet), which learns prototypes not only from the target area but also from its non-target counterpart. From our perspective, the non-target area is delineated into background, which is full of repeated textures, and salient objects, referred to as latent objects in this paper. Specifically, a background mining module (BMM) is developed to learn a dedicated background prototype through episodic learning. The learned background prototype replaces the target-object one for background filtering, reducing false positives. Moreover, a latent object mining module (LOMM), based on the self-attention mechanism, works together with the BMM to learn multiple soft-orthogonal prototypes from latent objects. The learned latent-object prototypes, which condense general knowledge of objects, are then used in a target object enhancement module (TOEM) to enhance the target-object prototype under the guidance of affinity-based scores. Extensive experiments on the PASCAL-5i and COCO-20i datasets demonstrate the superiority of BLPLNet, which outperforms state-of-the-art methods by an average of 0.60% on PASCAL-5i. Ablation studies validate the effectiveness of each component, and visualization results indicate that the learned latent-object prototypes indeed convey general knowledge of objects.
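To make the prototype-learning idea in the abstract concrete, the following is a minimal sketch (not the authors' released code) of the generic prototype-matching step that such a network builds on: a target-object prototype and a background prototype are pooled from masked support features, and every query pixel is scored against both. The helper names, tensor shapes, temperature value, and cosine-similarity readout are assumptions for illustration only; the BMM, LOMM, and TOEM modules described in the abstract are not reproduced here.

# Minimal sketch (assumptions noted above), PyTorch-style.
import torch
import torch.nn.functional as F

def masked_average_pooling(feat, mask):
    """feat: (B, C, H, W); mask: (B, 1, H, W) with values in {0, 1}.
    Pools one prototype vector of shape (B, C) over the masked region."""
    mask = F.interpolate(mask, size=feat.shape[-2:], mode="nearest")
    return (feat * mask).sum(dim=(2, 3)) / (mask.sum(dim=(2, 3)) + 1e-6)

def segment_query(query_feat, support_feat, support_mask, temperature=20.0):
    """Scores every query pixel against a target-object prototype (pooled from
    the support foreground) and a background prototype (pooled from the support
    non-target area), then softmaxes over the two classes."""
    fg_proto = masked_average_pooling(support_feat, support_mask)        # (B, C)
    bg_proto = masked_average_pooling(support_feat, 1.0 - support_mask)  # (B, C)
    protos = torch.stack([bg_proto, fg_proto], dim=1)                    # (B, 2, C)
    q = F.normalize(query_feat, dim=1)                                   # unit-norm features
    p = F.normalize(protos, dim=2)                                       # unit-norm prototypes
    sim = torch.einsum("bchw,bkc->bkhw", q, p)                           # cosine similarity maps
    return (temperature * sim).softmax(dim=1)                            # per-pixel class probs

# Usage with random tensors standing in for backbone features:
B, C, H, W = 2, 256, 32, 32
query_feat = torch.randn(B, C, H, W)
support_feat = torch.randn(B, C, H, W)
support_mask = (torch.rand(B, 1, H, W) > 0.5).float()
probs = segment_query(query_feat, support_feat, support_mask)  # (B, 2, H, W)

In this sketch, query pixels are scored against an explicit background prototype rather than asking the target-object prototype to both match the target and reject everything else, which mirrors the false-positive argument made in the abstract.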
Pages: 11