Decoupling Zero-Shot Semantic Segmentation

Cited by: 87
Authors
Ding, Jian [1, 2]
Xue, Nan [1]
Xia, Gui-Song [1]
Dai, Dengxin [2]
Affiliations
[1] Wuhan Univ, Captain, Wuhan, Peoples R China
[2] MPI Informat, Saarbrucken, Germany
Source
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2022
DOI
10.1109/CVPR52688.2022.01129
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Zero-shot semantic segmentation (ZS3) aims to segment novel categories that have not been seen during training. Existing works formulate ZS3 as a pixel-level zero-shot classification problem and transfer semantic knowledge from seen classes to unseen ones with the help of language models pre-trained only on text. While simple, the pixel-level ZS3 formulation has limited capability to integrate vision-language models, which are often pre-trained on image-text pairs and currently demonstrate great potential for vision tasks. Inspired by the observation that humans often perform segment-level semantic labeling, we propose to decouple ZS3 into two sub-tasks: 1) a class-agnostic grouping task that groups pixels into segments, and 2) a zero-shot classification task on segments. The former task does not involve category information and can be directly transferred to group pixels for unseen classes. The latter task operates at the segment level and provides a natural way to leverage large-scale vision-language models pre-trained on image-text pairs (e.g., CLIP) for ZS3. Based on the decoupling formulation, we propose a simple and effective zero-shot semantic segmentation model, called ZegFormer, which outperforms previous methods on standard ZS3 benchmarks by large margins, e.g., 22 points on PASCAL VOC and 3 points on COCO-Stuff in terms of mIoU for unseen classes. Code will be released at https://github.com/dingjiansw101/ZegFormer.
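The two sub-tasks described in the abstract can be illustrated with a toy sketch. This is not ZegFormer's implementation: the function names, the 2-D embeddings, and the class names are all hypothetical, and the hand-written vectors merely stand in for embeddings that a vision-language model such as CLIP might produce. The grouping step is assumed to have already assigned each pixel a class-agnostic segment id.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def classify_segments(segment_embeddings, class_embeddings):
    """Sub-task 2: give each segment the class whose embedding is most similar."""
    return [
        max(class_embeddings, key=lambda name: cosine(seg, class_embeddings[name]))
        for seg in segment_embeddings
    ]


def label_pixels(segment_ids, segment_labels):
    """Turn a map of segment ids into a map of class names."""
    return [[segment_labels[s] for s in row] for row in segment_ids]


# Sub-task 1 output (assumed): a 2x3 image whose pixels are grouped into
# two class-agnostic segments, 0 and 1.
segment_ids = [[0, 0, 1],
               [0, 1, 1]]

# Hypothetical segment and class embeddings in a shared 2-D space.
segment_embeddings = [[0.9, 0.1], [0.2, 0.8]]
class_embeddings = {"cat": [1.0, 0.0], "giraffe": [0.0, 1.0]}

segment_labels = classify_segments(segment_embeddings, class_embeddings)
pixel_labels = label_pixels(segment_ids, segment_labels)
```

Because the class names enter only through `class_embeddings`, adding an unseen class in this sketch means adding one more entry to that dictionary; the grouping output is left unchanged, which mirrors the abstract's point that the grouping task does not involve category information.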
Pages: 11573-11582
Number of pages: 10