Pro-Tuning: Unified Prompt Tuning for Vision Tasks

Cited by: 7
Authors
Nie, Xing [1 ,2 ]
Ni, Bolin [1 ,2 ]
Chang, Jianlong [3 ]
Meng, Gaofeng [1 ,2 ,4 ]
Huo, Chunlei [5 ,6 ]
Xiang, Shiming [1 ,2 ]
Tian, Qi [3 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab Multimodal Artificial Intelligence Syst, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
[3] Huawei Cloud & AI, Beijing 100095, Peoples R China
[4] HK Inst Sci & Innovat, CAS Ctr Artificial Intelligence & Robot, Hong Kong, Peoples R China
[5] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
[6] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
Keywords
Task analysis; Adaptation models; Tuning; Computational modeling; Transformers; Visualization; Training; Prompt-based learning; representation learning; task-specific knowledge; transfer learning
DOI
10.1109/TCSVT.2023.3327605
Chinese Library Classification (CLC)
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology]
Subject Classification Codes
0808; 0809
Abstract
In computer vision, fine-tuning is the de facto approach to leveraging pre-trained vision models for downstream tasks. However, it is challenging to deploy in practice because it performs parameter-inefficient global updates and relies heavily on high-quality downstream data. Recently, prompt-based learning, which adds task-relevant prompts to adapt pre-trained models to downstream tasks, has drastically boosted performance on many natural language downstream tasks. In this work, we extend this notable prompt-driven transfer ability to vision models as an alternative to fine-tuning. To this end, we propose parameter-efficient Prompt tuning (Pro-tuning) to adapt diverse frozen pre-trained models to a wide variety of downstream vision tasks. The key to Pro-tuning is prompt-based tuning, i.e., learning task-specific vision prompts for downstream input images while keeping the pre-trained model frozen. By training only a small number of additional parameters, Pro-tuning can produce compact and robust downstream models for both CNN-based and transformer-based network architectures. Comprehensive experiments show that the proposed Pro-tuning outperforms fine-tuning on a broad range of vision tasks and scenarios, including image classification (generic objects, class imbalance, image corruption, natural adversarial examples, and out-of-distribution generalization) and dense prediction tasks such as object detection and semantic segmentation.
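The abstract describes the core mechanism: a frozen pre-trained backbone plus a small set of learnable, task-specific prompt parameters. Below is a minimal PyTorch sketch of that general idea, assuming a torchvision ResNet-50 backbone (torchvision >= 0.13) and an additive pixel-space prompt; it illustrates generic visual prompt tuning with a frozen model, not the paper's exact Pro-tuning module, and the class name PromptTunedModel and all hyperparameters are illustrative.

```python
# Minimal sketch of prompt tuning with a frozen backbone (illustrative only;
# NOT the paper's Pro-tuning module). Assumes torchvision >= 0.13.
import torch
import torch.nn as nn
from torchvision.models import resnet50

class PromptTunedModel(nn.Module):
    """Frozen pre-trained backbone + learnable visual prompt + task head."""

    def __init__(self, num_classes: int, image_size: int = 224):
        super().__init__()
        self.backbone = resnet50(weights="IMAGENET1K_V2")  # pre-trained weights
        for p in self.backbone.parameters():
            p.requires_grad = False  # the pre-trained model stays frozen
        self.backbone.fc = nn.Identity()  # expose the 2048-d pooled features
        # Task-specific parameters: an additive pixel-space prompt and a head.
        self.prompt = nn.Parameter(torch.zeros(1, 3, image_size, image_size))
        self.head = nn.Linear(2048, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(x + self.prompt)  # prompt-conditioned input
        return self.head(feats)

model = PromptTunedModel(num_classes=10)
# Only the prompt and the head are optimized; the backbone receives no updates.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-3)
logits = model(torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 10])
```

In this sketch only the prompt tensor and the linear head are updated, which captures the parameter-efficiency property the abstract claims; the paper's actual prompt-block design differs, so consult the DOI above for the exact formulation.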
Pages: 4653-4667
Page count: 15