Fast Certification of Vision-Language Models Using Incremental Randomized Smoothing

Citations: 0

Authors:

Nirala, Ashutosh [1]

Joshi, Ameya [2]

Sarkar, Soumik [1]

Hegde, Chinmay [2]

Affiliations:

[1] Iowa State Univ, Ames, IA 50011 USA

[2] New York Univ, New York, NY USA

Source:

IEEE CONFERENCE ON SAFE AND TRUSTWORTHY MACHINE LEARNING, SATML 2024 | 2024

Funding:

U.S. National Science Foundation

Keywords:

Vision-language models; CLIP; certified robustness; randomized smoothing

DOI:

10.1109/SaTML59370.2024.00019

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

A key benefit of deep vision-language models such as CLIP is that they enable zero-shot open-vocabulary classification; the user can define novel class labels via natural language prompts at inference time. However, while CLIP-based zero-shot classifiers have demonstrated competitive performance across a range of domain shifts, they remain highly vulnerable to adversarial attacks. Therefore, ensuring the robustness of such models is crucial for their reliable deployment in the wild. In this work, we introduce Open Vocabulary Certification (OVC), a fast certification method designed for open-vocabulary models like CLIP via randomized smoothing techniques. Given a base "training" set of prompts and their corresponding certified CLIP classifiers, OVC relies on the observation that a classifier with a novel prompt can be viewed as a perturbed version of nearby classifiers in the base training set. Therefore, OVC can rapidly certify the novel classifier using a variation of incremental randomized smoothing. By using a caching trick, we achieve approximately two orders of magnitude acceleration in the certification process for novel prompts. To achieve further (heuristic) speedups, OVC approximates the embedding space at a given input using a multivariate normal distribution, bypassing the need for sampling via forward passes through the vision backbone. We demonstrate the effectiveness of OVC through experimental evaluation using multiple vision-language backbones on the CIFAR-10 and ImageNet test datasets.
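As a rough illustration of the ideas named in the abstract, the sketch below shows plain randomized-smoothing certification of a zero-shot classifier in which the embeddings of the noisy images are computed once and then reused for every new prompt set. It is not the OVC algorithm: the abstract does not say what is cached, so caching noisy image embeddings is an assumption, and the incremental step and the multivariate normal approximation are not shown. The names `cache_noisy_embeddings`, `certify_from_cache`, and `encode_image` are hypothetical, the encoder is a toy identity map, and a simple Hoeffding lower bound stands in for the Clopper-Pearson bound commonly used in the randomized smoothing literature.

```python
import math
from statistics import NormalDist

import numpy as np


def cache_noisy_embeddings(image, encode_image, sigma, n, rng):
    """Run the (expensive) image encoder once on n Gaussian-perturbed copies."""
    noise = sigma * rng.standard_normal((n,) + image.shape)
    emb = np.stack([encode_image(x) for x in image + noise])
    # Unit-normalise so that dot products are cosine similarities.
    return emb / np.linalg.norm(emb, axis=1, keepdims=True)


def certify_from_cache(cached_emb, text_emb, sigma, n_select=100, alpha=0.001):
    """Certify one prompt set using only dot products with cached embeddings.

    Returns (predicted_class, l2_radius), or (None, 0.0) to abstain.
    """
    text = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    preds = (cached_emb @ text.T).argmax(axis=1)

    # Use the first n_select samples to guess the top class ...
    guess = int(np.bincount(preds[:n_select], minlength=len(text)).argmax())

    # ... and the remaining samples to lower-bound its probability.
    est = preds[n_select:]
    p_hat = float(np.mean(est == guess))
    # Assumption: one-sided Hoeffding bound, holding with probability 1 - alpha.
    p_lower = p_hat - math.sqrt(math.log(1.0 / alpha) / (2.0 * len(est)))

    if p_lower <= 0.5:
        return None, 0.0
    # Standard randomized-smoothing radius: sigma * Phi^{-1}(p_lower).
    return guess, sigma * NormalDist().inv_cdf(p_lower)


# Toy usage: identity "encoder", 4-dimensional "image".
rng = np.random.default_rng(0)
image = np.array([5.0, 0.0, 0.0, 0.0])
cache = cache_noisy_embeddings(image, lambda x: x, sigma=0.25, n=1000, rng=rng)

base_prompts = np.eye(4)                       # stand-in text embeddings
novel_prompts = np.array([[0.0, 1.0, 0.0, 0.0],
                          [1.0, 0.2, 0.0, 0.0]])

print(certify_from_cache(cache, base_prompts, sigma=0.25))
print(certify_from_cache(cache, novel_prompts, sigma=0.25))  # no new encoder calls
```

In this sketch the second call reuses `cache` unchanged, so certifying a new prompt set costs one matrix product rather than a fresh batch of forward passes through the vision backbone.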


Pages: 252-271

Page count: 20

