Vision-Language Models for Zero-Shot Classification of Remote Sensing Images

Cited by: 6

Authors:

Al Rahhal, Mohamad Mahmoud [1]

Bazi, Yakoub [2]

Elgibreen, Hebah [3]

Zuair, Mansour [2]

Affiliations:

[1] King Saud Univ, Coll Appl Comp Sci, Appl Comp Sci Dept, Riyadh 11543, Saudi Arabia

[2] King Saud Univ, Comp Engn Dept, Coll Comp & Informat Sci, Riyadh 11543, Saudi Arabia

[3] Coll Comp & Informat Sci, Informat Technol Dept, Riyadh 11543, Saudi Arabia

Source:

APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 22

Keywords:

Contrastive Language-Image Pre-Training models; remote sensing; zero-shot classification

DOI:

10.3390/app132212462

Chinese Library Classification (CLC) number:

O6 [Chemistry]

Discipline classification code:

0703

Abstract:

Zero-shot classification is challenging because it requires a model to categorize images from classes it did not encounter during training. Previous research in remote sensing (RS) has approached this task by training image-based models on known RS classes and then predicting unfamiliar classes, but the results have been unsatisfactory. In this paper, we propose an alternative approach that leverages vision-language models (VLMs), which are pre-trained to learn the associations between images and text from diverse general-purpose computer vision datasets of image-text pairs. Specifically, we investigate thirteen VLMs derived from Contrastive Language-Image Pre-Training (CLIP/Open-CLIP) with varying levels of parameter complexity. In our experiments, we determine the most suitable prompt for querying the language capabilities of the VLM on RS images. Furthermore, we show that zero-shot classification, particularly with large CLIP models, achieves higher accuracy on three widely recognized RS scene datasets than existing RS solutions.
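The prompt-based zero-shot procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prompt template "a satellite photo of a {}.", the class names, the three-dimensional embeddings, and the logit scale of 100 are all assumptions chosen for the example. In a real pipeline the embeddings would come from the image and text encoders of a CLIP-style model; here they are hand-written toy vectors so that the scoring step (cosine similarity followed by a softmax) can be read on its own.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def build_prompts(class_names, template="a satellite photo of a {}."):
    """Turn class names into text prompts. The template is a hypothetical choice."""
    return [template.format(name) for name in class_names]


def zero_shot_classify(image_emb, text_embs, class_names, logit_scale=100.0):
    """Pick the class whose prompt embedding is most similar to the image embedding.

    logit_scale is an assumed temperature applied before the softmax.
    """
    img = l2_normalize(image_emb)
    sims = [
        sum(a * b for a, b in zip(img, l2_normalize(txt)))
        for txt in text_embs
    ]
    probs = softmax([logit_scale * s for s in sims])
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs


# Hypothetical scene classes; no labeled training images are needed for them.
class_names = ["airport", "forest", "harbor"]
prompts = build_prompts(class_names)

# Toy stand-ins for encoder outputs (one text embedding per prompt).
text_embs = [
    [0.9, 0.1, 0.0],
    [0.1, 0.9, 0.1],
    [0.0, 0.2, 0.9],
]
image_emb = [0.2, 0.8, 0.1]

label, probs = zero_shot_classify(image_emb, text_embs, class_names)
print(prompts[class_names.index(label)], "->", label)
```

Because the class list only enters through the text prompts, new classes can be added by writing new prompts, with no retraining on RS images. This is also why the wording of the prompt matters to the final accuracy.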


Pages: 16
