Word vector embedding and self-supplementing network for Generalized Few-shot Semantic Segmentation

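Cited by: 0
Authors
Wang, Xiaowei [1]
Chen, Qiong [1]
Yang, Yong [1]
Affiliations
[1] South China University of Technology, School of Computer Science and Engineering, Guangzhou 510006, People's Republic of China
Funding
National Natural Science Foundation of China;
Keywords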
Generalized Few-shot Semantic Segmentation; Inter-class interference; Semantic word embedding; Self-supplementing;
DOI
10.1016/j.neucom.2024.128737
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Given sufficient base-class samples and only a few novel-class samples, Generalized Few-shot Semantic Segmentation (GFSS) classifies each pixel of the query image as a base class, a novel class, or background. A standard GFSS approach involves two training stages: base class learning and novel class updating. However, inter-class interference and information loss, both of which contribute to the poor performance of GFSS, have not been considered jointly. To address this problem, we propose an Embedded-Self-Supplementing Network (ESSNet), which uses semantic word embedding and query-set self-supplementing information to enhance segmentation accuracy. Specifically, the semantic word embedding module employs distance information between word vectors to help the model learn the distances between class prototypes. To transform the semantic word vector prototypes from the semantic space to the visual embedding space, we design a triplet loss function to supervise the word vector embedding module, where the word vector prototype serves as the anchor and positive and negative samples are collected from the general features of the support image. To compensate for the information loss caused by using prototypes to represent classes, we propose a self-supplementing module to mine the information contained in the query image. This module first makes a preliminary prediction on the query image, then selects high-confidence areas to form pseudo labels, and finally uses the pseudo labels to extract query prototypes that supplement the missing information. Extensive experiments on PASCAL-5^i and COCO-20^i show that ESSNet outperforms state-of-the-art methods in all settings.
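The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Euclidean distance, the margin, the confidence threshold, and the convex-combination fusion in `supplement` are all assumptions made for illustration, since the abstract does not specify them. Features are plain Python lists standing in for per-pixel feature vectors.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    Here the anchor plays the role of an embedded word vector prototype,
    and positive/negative are visual features of the same / another class.
    The margin value is an assumption.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def query_prototype(features, probs, threshold=0.8):
    """Average the features of pixels whose predicted probability for a
    class is at least `threshold` (the pseudo-labelled pixels).

    Returns None when no pixel is confident enough.
    The threshold value is an assumption.
    """
    selected = [f for f, p in zip(features, probs) if p >= threshold]
    if not selected:
        return None
    dim = len(selected[0])
    return [sum(f[i] for f in selected) / len(selected) for i in range(dim)]


def supplement(support_proto, query_proto, alpha=0.5):
    """Blend a support prototype with a query prototype.

    The convex combination is a hypothetical fusion rule; the abstract
    only says that query prototypes supplement the missing information.
    """
    if query_proto is None:
        return list(support_proto)
    return [alpha * s + (1 - alpha) * q for s, q in zip(support_proto, query_proto)]
```

In this sketch, a query pixel contributes to the class prototype only if the preliminary prediction is confident, so low-confidence regions cannot pull the prototype toward the wrong class.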
Pages: 10