Word vector embedding and self-supplementing network for Generalized Few-shot Semantic Segmentation

Cited by: 0
Authors
Wang, Xiaowei [1]
Chen, Qiong [1]
Yang, Yong [1]
Affiliations
[1] South China Univ Technol, Sch Comp Sci & Engn, Guangzhou 510006, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Generalized Few-shot Semantic Segmentation; Inter-class interference; Semantic word embedding; Self-supplementing;
DOI
10.1016/j.neucom.2024.128737
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Given sufficient base-class samples and only a few novel-class samples, Generalized Few-shot Semantic Segmentation (GFSS) classifies each pixel of a query image as a base class, a novel class, or background. A standard GFSS approach involves two training stages: base class learning and novel class updating. However, inter-class interference and information loss, both of which contribute to the poor performance of GFSS, have not been considered jointly. To address this problem, we propose an Embedded-Self-Supplementing Network (ESSNet), which combines semantic word embedding with self-supplementing information from the query set to enhance segmentation accuracy. Specifically, the semantic word embedding module uses distance information between word vectors to help the model learn the distances between class prototypes. To transform the semantic word vector prototypes from the semantic space to the visual embedding space, we design a triplet loss function to supervise the word vector embedding module, where the word vector prototype serves as the anchor and positive and negative samples are collected from the general features of the support image. To compensate for the information loss caused by representing classes with prototypes, we propose a self-supplementing module that mines the information contained in the query image. This module first makes a preliminary prediction on the query image, then selects high-confidence regions to form pseudo labels, and finally uses the pseudo labels to extract query prototypes that supplement the missing information. Extensive experiments on PASCAL-5^i and COCO-20^i show that ESSNet outperforms state-of-the-art methods in all settings.
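The two mechanisms named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract gives no formulas, so the Euclidean distance, the averaging over samples, the `margin` and `threshold` values, and the function names `triplet_loss` and `query_prototype` are all assumptions made for illustration.

```python
import numpy as np


def triplet_loss(anchor, positives, negatives, margin=0.5):
    """Hinge-style triplet loss with a word-vector prototype as the anchor.

    anchor:    (D,) word-vector prototype, assumed already projected
               into the visual embedding space.
    positives: (P, D) support features of the anchor's class.
    negatives: (N, D) support features of other classes or background.
    The distance metric and margin are assumptions, not from the paper.
    """
    d_pos = np.linalg.norm(positives - anchor, axis=1).mean()
    d_neg = np.linalg.norm(negatives - anchor, axis=1).mean()
    # Pull positives toward the anchor, push negatives at least `margin` further.
    return max(0.0, d_pos - d_neg + margin)


def query_prototype(features, probs, threshold=0.8):
    """Build a query prototype from high-confidence pixels.

    features:  (H, W, D) query feature map.
    probs:     (H, W) preliminary per-pixel probability for one class.
    threshold: assumed confidence cut-off used to form the pseudo label.
    Returns a (D,) prototype, or None if no pixel passes the threshold.
    """
    pseudo_label = probs >= threshold      # boolean mask of confident pixels
    if not pseudo_label.any():
        return None
    # Masked average pooling over the selected pixels.
    return features[pseudo_label].mean(axis=0)


# Toy usage: a 2x2 query feature map with 2-dimensional features.
features = np.array([[[2.0, 4.0], [9.0, 9.0]],
                     [[4.0, 8.0], [9.0, 9.0]]])
probs = np.array([[0.90, 0.10],
                  [0.85, 0.30]])
proto = query_prototype(features, probs)   # averages the two confident pixels
loss = triplet_loss(np.array([1.0, 0.0]),
                    positives=np.array([[1.0, 0.0]]),
                    negatives=np.array([[0.0, 0.0]]))
```

In this toy case the positives coincide with the anchor and the negative lies one unit away, so the margin of 0.5 is already satisfied and the loss is zero; only the two pixels with probability of at least 0.8 contribute to the query prototype.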
Pages: 10