TAG: Guidance-Free Open-Vocabulary Semantic Segmentation

Cited by: 0
Authors
Kawano, Yasufumi [1 ]
Aoki, Yoshimitsu [1 ]
Affiliations
[1] Keio Univ, Grad Sch Integrated Design Engn, Yokohama, Kanagawa 2238522, Japan
Source
IEEE ACCESS | 2024, Vol. 12
Keywords
Semantic segmentation; Training; Databases; Annotations; Task analysis; Semantics; Vocabulary; Computer vision; Classification algorithms; open-vocabulary; zero-guidance
DOI
10.1109/ACCESS.2024.3418210
CLC Number (Chinese Library Classification)
TP [Automation Technology, Computer Technology]
Subject Classification Number
0812
Abstract
Semantic segmentation is a fundamental task in computer vision in which each pixel of an image is classified into a category. Traditional methods, however, face significant challenges: they require pixel-level annotations and extensive training, and because supervised learning draws on a limited set of predefined categories, models typically struggle with rare classes and cannot recognize new ones. Unsupervised and open-vocabulary segmentation, proposed to address these issues, face their own limitations, including the inability to assign specific class labels to clusters and the need for user-provided text queries as guidance. In this context, we propose TAG, a novel approach that achieves Training-, Annotation-, and Guidance-free open-vocabulary semantic segmentation. TAG leverages pre-trained models such as CLIP and DINO to segment images into meaningful categories without additional training or dense annotations, and it retrieves class labels from an external database, giving it the flexibility to adapt to new scenarios. TAG achieves state-of-the-art results on PascalVOC, PascalContext, and ADE20K for open-vocabulary segmentation without given class names, e.g., an improvement of +15.3 mIoU on PascalVOC.
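As a rough illustration of the pipeline the abstract describes (not the authors' released code), the sketch below groups frozen DINO patch features into segments, embeds each masked segment with CLIP, and retrieves the nearest entry from a stand-in text database as its class name. The model choices, the k-means grouping step, the four-entry database, and example.jpg are all illustrative assumptions rather than the paper's actual components.

```python
import torch
import torch.nn.functional as F
from PIL import Image
from sklearn.cluster import KMeans
from transformers import CLIPModel, CLIPProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"

# Frozen pre-trained backbones: DINO for unsupervised grouping,
# CLIP for matching image segments against text.
dino = torch.hub.load("facebookresearch/dino:main", "dino_vits16").to(device).eval()
clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").to(device).eval()
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Hypothetical stand-in for the external text database TAG retrieves from;
# the paper uses a far larger open-vocabulary label source.
database = ["a photo of a dog", "a photo of grass",
            "a photo of sky", "a photo of a fence"]

image = Image.open("example.jpg").convert("RGB")
pixels = proc(images=image, return_tensors="pt")["pixel_values"].to(device)  # (1,3,224,224)

with torch.no_grad():
    # 1) DINO patch features, dropping the CLS token; 224/16 -> 14x14 patches.
    feats = dino.get_intermediate_layers(pixels, n=1)[0][0, 1:]  # (196, 384)

    # 2) Unsupervised segments via k-means over patch features
    #    (a simple stand-in for the paper's grouping step).
    seg = KMeans(n_clusters=4, n_init=10).fit_predict(feats.cpu().numpy())
    seg = torch.as_tensor(seg).view(14, 14)

    # 3) Embed the database entries once with CLIP's text encoder.
    tok = proc(text=database, return_tensors="pt", padding=True).to(device)
    txt = F.normalize(clip.get_text_features(**tok), dim=-1)  # (4, 512)

    # 4) For each segment, mask the image, embed it with CLIP, and retrieve
    #    the nearest database entry -- no user-provided queries needed.
    #    (Masking CLIP-normalized pixels with zeros is a simplification.)
    for k in range(4):
        mask = (seg == k).float()[None, None].to(device)  # (1,1,14,14)
        mask = F.interpolate(mask, size=pixels.shape[-2:], mode="nearest")
        emb = F.normalize(clip.get_image_features(pixel_values=pixels * mask), dim=-1)
        best = (emb @ txt.T).argmax().item()
        print(f"segment {k}: {database[best]}")
```

The key design point this sketch tries to capture is that retrieval from a text database replaces the user-supplied class names that conventional open-vocabulary methods require, which is what makes the approach guidance-free.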
Pages: 88322-88331
Page count: 10
Related Papers
50 in total
  • [1] Class Enhancement Losses With Pseudo Labels for Open-Vocabulary Semantic Segmentation
    Dao, Son Duy
    Shi, Hengcan
    Phung, Dinh
    Cai, Jianfei
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 8442 - 8453
  • [2] Enhancing Open-Vocabulary Semantic Segmentation with Prototype Retrieval
    Barsellotti, Luca
    Amoroso, Roberto
    Baraldi, Lorenzo
    Cucchiara, Rita
    IMAGE ANALYSIS AND PROCESSING, ICIAP 2023, PT II, 2023, 14234 : 196 - 208
  • [3] Image-text aggregation for open-vocabulary semantic segmentation
    Cheng, Shengyang
    Huang, Jianyong
    Wang, Xiaodong
    Huang, Lei
    Wei, Zhiqiang
    NEUROCOMPUTING, 2025, 630
  • [4] SAN: Side Adapter Network for Open-Vocabulary Semantic Segmentation
    Xu, Mengde
    Zhang, Zheng
    Wei, Fangyun
    Hu, Han
    Bai, Xiang
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (12) : 15546 - 15561
  • [5] LLMFormer: Large Language Model for Open-Vocabulary Semantic Segmentation
    Shi, Hengcan
    Dao, Son Duy
    Cai, Jianfei
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2025, 133 (02) : 742 - 759
  • [6] Open-Vocabulary RGB-Thermal Semantic Segmentation
    Zhao, Guoqiang
    Huang, Junjie
    Yan, Xiaoyun
    Wang, Zhaojing
    Tang, Junwei
    Ou, Yangjun
    Hu, Xinrong
    Peng, Tao
    COMPUTER VISION - ECCV 2024, PT LXXIV, 2025, 15132 : 304 - 320
  • [7] Purify Then Guide: A Bi-Directional Bridge Network for Open-Vocabulary Semantic Segmentation
    Pan, Yuwen
    Sun, Rui
    Wang, Yuan
    Yang, Wenfei
    Zhang, Tianzhu
    Zhang, Yongdong
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, 35 (01) : 343 - 356
  • [8] Generalization Boosted Adapter for Open-Vocabulary Segmentation
    Xu, Wenhao
    Wang, Changwei
    Feng, Xuxiang
    Xu, Rongtao
    Huang, Longzhao
    Zhang, Zherui
    Guo, Li
    Xu, Shibiao
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, 35 (01) : 520 - 533
  • [9] FreeMix: Open-Vocabulary Domain Generalization of Remote-Sensing Images for Semantic Segmentation
    Wu, Jingyi
    Shi, Jingye
    Zhao, Zeyong
    Liu, Ziyang
    Zhi, Ruicong
    REMOTE SENSING, 2025, 17 (08)
  • [10] Open-Vocabulary Camouflaged Object Segmentation
    Pang, Youwei
    Zhao, Xiaoqi
    Zuo, Jiaming
    Zhang, Lihe
    Lu, Huchuan
    COMPUTER VISION - ECCV 2024, PT XLVII, 2025, 15105 : 476 - 495