Decoupling Zero-Shot Semantic Segmentation

Cited by: 87

Authors:

Ding, Jian [1, 2]

Xue, Nan [1]

Xia, Gui-Song [1]

Dai, Dengxin [2]

Affiliations:

[1] Wuhan Univ, Captain, Wuhan, Peoples R China

[2] MPI Informat, Saarbrucken, Germany

Source:

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2022

Keywords:

DOI:

10.1109/CVPR52688.2022.01129

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Zero-shot semantic segmentation (ZS3) aims to segment novel categories that have not been seen during training. Existing works formulate ZS3 as a pixel-level zero-shot classification problem and transfer semantic knowledge from seen classes to unseen ones with the help of language models pre-trained only on text. While simple, the pixel-level ZS3 formulation has limited capability to integrate vision-language models, which are often pre-trained with image-text pairs and currently demonstrate great potential for vision tasks. Inspired by the observation that humans often perform segment-level semantic labeling, we propose to decouple ZS3 into two sub-tasks: 1) a class-agnostic grouping task that groups pixels into segments, and 2) a zero-shot classification task on segments. The former task does not involve category information and can be directly transferred to group pixels for unseen classes. The latter task operates at the segment level and provides a natural way to leverage large-scale vision-language models pre-trained with image-text pairs (e.g., CLIP) for ZS3. Based on the decoupling formulation, we propose a simple and effective zero-shot semantic segmentation model, called ZegFormer, which outperforms the previous methods on standard ZS3 benchmarks by large margins, e.g., 22 points on PASCAL VOC and 3 points on COCO-Stuff in terms of mIoU for unseen classes. Code will be released at https://github.com/dingjiansw101/ZegFormer.
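
The two sub-tasks described in the abstract can be illustrated with a small NumPy sketch. This is not the ZegFormer implementation: the function names `classify_segments` and `segments_to_pixels`, the temperature value, and the toy embeddings (which stand in for segment features and CLIP-style text features) are all assumptions made for illustration. It only shows the general idea that class-agnostic masks can be labeled afterwards by comparing segment embeddings with class-name embeddings.

```python
import numpy as np

def classify_segments(segment_emb, text_emb, temperature=0.01):
    """Zero-shot classification on segments (hypothetical helper).

    segment_emb: (num_segments, dim) embeddings, one per segment.
    text_emb:    (num_classes, dim) embeddings of class names.
    Returns a (num_segments, num_classes) matrix of class probabilities
    obtained from a softmax over cosine similarities.
    """
    seg = segment_emb / np.linalg.norm(segment_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = seg @ txt.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

def segments_to_pixels(masks, class_probs):
    """Turn segment-level predictions into a per-pixel label map.

    masks:       (num_segments, H, W) class-agnostic soft or binary masks.
    class_probs: (num_segments, num_classes) output of classify_segments.
    Returns an (H, W) array of class indices.
    """
    pixel_scores = np.einsum("nhw,nc->chw", masks, class_probs)
    return pixel_scores.argmax(axis=0)

# Toy data (assumed): two segments covering the left and right image halves.
masks = np.zeros((2, 2, 4))
masks[0, :, :2] = 1.0
masks[1, :, 2:] = 1.0

# Toy embeddings (assumed): three class names in a 2-D embedding space.
segment_emb = np.array([[1.0, 0.1], [0.1, 1.0]])
text_emb = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])

probs = classify_segments(segment_emb, text_emb)
label_map = segments_to_pixels(masks, probs)
print(label_map)
```

Because the grouping step never sees class names, adding an unseen class in this sketch only requires appending one more row to `text_emb`; the masks themselves are unchanged.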


Pages: 11573-11582

Page count: 10


