TAG: Guidance-Free Open-Vocabulary Semantic Segmentation

Cited by: 0
Authors
Kawano, Yasufumi [1]
Aoki, Yoshimitsu [1]
Affiliations
[1] Keio Univ, Grad Sch Integrated Design Engn, Yokohama, Kanagawa 2238522, Japan
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Semantic segmentation; Training; Databases; Annotations; Task analysis; Semantics; Vocabulary; Computer vision; Classification algorithms; open-vocabulary; zero-guidance;
DOI
10.1109/ACCESS.2024.3418210
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Semantic segmentation is a crucial task in computer vision, where each pixel in an image is classified into a category. However, traditional methods face significant challenges, including the need for pixel-level annotations and extensive training. Furthermore, because supervised learning uses a limited set of predefined categories, models typically struggle with rare classes and cannot recognize new ones. Unsupervised and open-vocabulary segmentation, proposed to tackle these issues, have limitations of their own, including the inability to assign specific class labels to clusters and the need for user-provided text queries as guidance. In this context, we propose a novel approach, TAG, which achieves Training-, Annotation-, and Guidance-free open-vocabulary semantic segmentation. TAG utilizes pre-trained models such as CLIP and DINO to segment images into meaningful categories without additional training or dense annotations. It retrieves class labels from an external database, providing flexibility to adapt to new scenarios. TAG achieves state-of-the-art results on PascalVOC, PascalContext, and ADE20K for open-vocabulary segmentation without given class names, including an improvement of +15.3 mIoU on PascalVOC.
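The abstract states that class labels are retrieved from an external database rather than supplied by the user. The sketch below illustrates only that general idea, as a nearest-neighbor lookup by cosine similarity; it is not the authors' implementation. The function names (`cosine`, `retrieve_label`), the dictionary-based database, and the 3-dimensional toy embeddings are all hypothetical, chosen for illustration. In a real pipeline the embeddings would presumably come from a pre-trained model such as CLIP.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_label(segment_embedding, database):
    """Return the label whose stored embedding is most similar to the segment.

    `database` is a hypothetical dict mapping label -> embedding vector.
    """
    return max(database, key=lambda label: cosine(segment_embedding, database[label]))


# Toy embeddings, made up for illustration only (assumption, not real data).
database = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.1, 0.9, 0.0],
    "sky": [0.0, 0.1, 0.9],
}
segment = [0.8, 0.2, 0.1]  # hypothetical embedding of one image segment
print(retrieve_label(segment, database))
```

Because the candidate labels live in the database rather than in the model, extending the vocabulary in this sketch only requires adding entries to the dictionary; no retraining step is involved.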
Pages: 88322-88331
Page count: 10