Diverse Data Augmentation with Diffusions for Effective Test-time Prompt Tuning

Citations: 13
Authors
Feng, Chun-Mei [1 ]
Yu, Kai [1 ]
Liu, Yong [1 ]
Khan, Salman [2 ,3 ]
Zuo, Wangmeng [4 ]
Affiliations
[1] A*STAR, Inst High Performance Comp (IHPC), Singapore, Singapore
[2] Mohamed bin Zayed Univ Artificial Intelligence MB, Abu Dhabi, U Arab Emirates
[3] Australian Natl Univ, Canberra, ACT, Australia
[4] Harbin Inst Technol, Harbin, Peoples R China
Source
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023
Funding
National Research Foundation, Singapore;
DOI
10.1109/ICCV51070.2023.00255
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Benefiting from prompt tuning, recent years have witnessed the promising performance of pre-trained vision-language models, e.g., CLIP, on diverse downstream tasks. In this paper, we focus on a particular setting of learning adaptive prompts on the fly for each test sample from an unseen new domain, which is known as test-time prompt tuning (TPT). Existing TPT methods typically rely on data augmentation and confidence selection. However, conventional data augmentation techniques, e.g., random resized crops, suffer from a lack of data diversity, while entropy-based confidence selection alone is not sufficient to guarantee prediction fidelity. To address these issues, we propose a novel TPT method, named DiffTPT, which leverages pre-trained diffusion models to generate diverse and informative new data. Specifically, we incorporate augmented data produced by both the conventional method and a pre-trained Stable Diffusion model to exploit their respective merits, improving the model's ability to adapt to unknown new test data. Moreover, to ensure the prediction fidelity of the generated data, we introduce a cosine similarity-based filtration technique to select the generated data with higher similarity to the single test sample. Our experiments on test datasets with distribution shifts and unseen categories demonstrate that DiffTPT improves the zero-shot accuracy by an average of 5.13% compared to the state-of-the-art TPT method.
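The cosine similarity-based filtration described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: `cosine_filter`, the `top_frac` parameter, and the random arrays standing in for CLIP image features are all assumptions introduced here for clarity.

```python
import numpy as np

def cosine_filter(test_feat, gen_feats, top_frac=0.5):
    """Keep the diffusion-generated samples whose features are most
    similar to the single test sample's feature (a sketch of the
    cosine similarity-based filtration described in the abstract)."""
    # L2-normalize so the dot product equals cosine similarity
    t = test_feat / np.linalg.norm(test_feat)
    g = gen_feats / np.linalg.norm(gen_feats, axis=1, keepdims=True)
    sims = g @ t                        # cosine similarity per generated sample
    k = max(1, int(len(sims) * top_frac))
    keep = np.argsort(sims)[::-1][:k]   # indices of the k most similar samples
    return keep, sims

# Toy usage: random vectors stand in for CLIP embeddings of the test
# image and six diffusion-generated augmentations.
rng = np.random.default_rng(0)
test_feat = rng.normal(size=8)
gen_feats = rng.normal(size=(6, 8))
keep, sims = cosine_filter(test_feat, gen_feats, top_frac=0.5)
print(keep)  # indices of the retained generated samples
```

In the paper's setting, the retained samples would then be fed to the entropy-based TPT objective; the threshold (here a fixed top fraction) is a design choice this sketch leaves open.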
Pages
2704-2714 (11 pages)