Label Propagation for Zero-shot Classification with Vision-Language Models

Cited by: 4
Authors
Stojnic, Vladan [1]
Kalantidis, Yannis [2]
Tolias, Giorgos [1]
Affiliations
[1] Czech Tech Univ, FEE, VRG, Prague, Czech Republic
[2] NAVER LABS Europe, Meylan, France
Source
2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2024
DOI
10.1109/CVPR52733.2024.02190
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Vision-Language Models (VLMs) have demonstrated impressive performance on zero-shot classification, i.e. classification when provided merely with a list of class names. In this paper, we tackle the case of zero-shot classification in the presence of unlabeled data. We leverage the graph structure of the unlabeled data and introduce ZLaP, a method based on label propagation (LP) that utilizes geodesic distances for classification. We tailor LP to graphs containing both text and image features and further propose an efficient method for performing inductive inference based on a dual solution and a sparsification step. We perform extensive experiments to evaluate the effectiveness of our method on 14 common datasets and show that ZLaP outperforms the latest related works. Code: https://github.com/vladan-stojnic/ZLaP
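As a rough illustration of the idea named in the abstract, the following is a minimal sketch of classic transductive label propagation on a k-nearest-neighbour graph whose nodes are class-name text features (treated as the labeled nodes) and unlabeled image features. It is not ZLaP itself: it omits the paper's tailoring to text-image graphs, the dual solution, the sparsification step, and inductive inference. The function names `knn_affinity` and `label_propagation`, the parameters `k` and `alpha`, and the toy 2-D features are assumptions made for this example; see the linked repository for the authors' actual code.

```python
import numpy as np


def knn_affinity(features, k):
    """Symmetric kNN affinity matrix from cosine similarities (assumes k < n)."""
    x = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = x @ x.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-matches from the neighbour search
    n = sim.shape[0]
    nbrs = np.argsort(-sim, axis=1)[:, :k]  # indices of the k most similar nodes
    rows = np.repeat(np.arange(n), k)
    cols = nbrs.ravel()
    w = np.zeros((n, n))
    w[rows, cols] = np.clip(sim[rows, cols], 0.0, None)  # drop negative similarities
    return np.maximum(w, w.T)  # symmetrize the directed kNN graph


def label_propagation(text_feats, image_feats, k=5, alpha=0.9):
    """Predict a class index per image by diffusing labels from the text nodes.

    Hypothetical helper: text node i is the single labeled example of class i.
    """
    num_classes = text_feats.shape[0]
    feats = np.vstack([text_feats, image_feats])
    w = knn_affinity(feats, k)

    # Symmetrically normalized adjacency S = D^{-1/2} W D^{-1/2}.
    deg = w.sum(axis=1)
    deg[deg == 0] = 1.0  # guard against isolated nodes
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    s = w * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # One-hot labels on the text nodes, zeros on the image nodes.
    y = np.zeros((feats.shape[0], num_classes))
    y[np.arange(num_classes), np.arange(num_classes)] = 1.0

    # Closed-form propagation: solve (I - alpha * S) Z = Y.
    z = np.linalg.solve(np.eye(feats.shape[0]) - alpha * s, y)
    return z[num_classes:].argmax(axis=1)
```

In practice the features would come from a VLM's text and image encoders; the dense solve shown here scales cubically with the number of nodes, so it is only suitable for small toy inputs.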
Pages: 23209-23218
Page count: 10