CSP-DCPE: Category-Specific Prompt with Deep Contextual Prompt Enhancement for Vision-Language Models

Authors
Wu, Chunlei [1, 2]
Wu, Yixiang [1, 2]
Xu, Qinfu [1, 2]
Zi, Xuebin [1, 2]
Affiliations
[1] China Univ Petr East China, Qingdao Inst Software, Coll Comp Sci & Technol, Qingdao 266580, Peoples R China
[2] China Univ Petr East China, Coll Comp Sci & Technol, Shandong Key Lab Intelligent Oil & Gas Ind Softwar, Qingdao 266580, Peoples R China
Source
ELECTRONICS | 2025 / Vol. 14 / Issue 04
Funding
National Natural Science Foundation of China;
Keywords
image classification; pre-trained vision-language models; multi-modal; prompt learning;
DOI
10.3390/electronics14040673
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Prompt learning has recently emerged as a viable technique for fine-tuning pre-trained vision-language models (VLMs). Prompts allow a pre-trained VLM to be adapted quickly to a specific downstream task without updating the original pre-trained weights. However, most existing work on prompt learning relies on non-specific prompts and pays little attention to category-specific data. In this paper, we present the Category-Specific Prompt (CSP), which integrates task-oriented information into the model and thereby improves its capacity to comprehend and execute complex tasks. To make better use of features when category-specific and non-specific prompts are combined, we introduce a deep prompt-learning method, Deep Contextual Prompt Enhancement (DCPE). DCPE outputs features rich in text-embedding knowledge that change in response to the input through attention-based interactions, so that the model also contains instance-oriented information. Combining the two methods, our architecture, CSP-DCPE, contains both task-oriented and instance-oriented information and achieves state-of-the-art average scores on 11 benchmark image-classification datasets.
Pages: 22