P-Distill: Efficient and Effective Prompt Tuning Using Knowledge Distillation

Cited by: 0
Authors
Won, Hyun-Sik [1]
Choi, Joon-Young [2]
Zaman, Namrah [1]
Aliyeva, Dinara [3]
Kim, Kang-Min [1,4]
Affiliations
[1] Catholic Univ Korea, Dept Artificial Intelligence, Bucheon 14662, South Korea
[2] Danggeun Market Inc, Seoul 06611, South Korea
[3] Univ North Carolina Chapel Hill, Coll Arts & Sci, Dept Comp Sci, Chapel Hill, NC 27599 USA
[4] Catholic Univ Korea, Dept Data Sci, Bucheon 14662, South Korea
Source
APPLIED SCIENCES-BASEL | 2025, Vol. 15, Issue 05
Funding
National Research Foundation of Singapore;
Keywords
knowledge distillation; natural language processing; natural language understanding; pre-trained language models; prompt compression; prompt engineering; prompt tuning; P-tuning v2;
DOI
10.3390/app15052420
Chinese Library Classification (CLC)
O6 [Chemistry];
Subject Classification Code
0703;
Abstract
In natural language processing (NLP), prompt-based learning is widely used for parameter-efficient learning. However, it reduces the usable input length by the length of the attached prompt, leading to inefficient use of the input space. In this study, we propose P-Distill, a novel prompt compression method that mitigates this limitation of prompt-based learning while maintaining performance via knowledge distillation. The knowledge distillation process of P-Distill consists of two methods, namely prompt initialization and prompt distillation. Experiments on various NLP tasks demonstrated that P-Distill performed comparably to or better than other state-of-the-art prompt-based learning methods, even with significantly shorter prompts. Specifically, P-Distill achieved a peak improvement of 1.90% even with prompt lengths compressed to one-eighth. A further study provides insights into the distinct impact of each method on the overall performance of P-Distill. Our code will be released upon acceptance.
Pages: 17
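The abstract describes prompt compression through two knowledge distillation steps, prompt initialization and prompt distillation, but gives no implementation detail. The minimal PyTorch sketch below shows one plausible way such a pipeline could look: a long "teacher" soft prompt is compressed into a prompt one-eighth as long by pooling-based initialization and a KL-divergence distillation loss. The class and function names (PromptedClassifier, init_student_prompt, distill_step), all sizes, and the specific pooling and loss choices are illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch of prompt compression via knowledge distillation:
# a long teacher soft prompt is distilled into a short student soft prompt.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PromptedClassifier(nn.Module):
    """A frozen toy encoder with a trainable soft prompt prepended to the input."""

    def __init__(self, encoder: nn.Module, embed_dim: int, prompt_len: int, num_labels: int):
        super().__init__()
        self.encoder = encoder  # frozen backbone standing in for a pre-trained LM
        self.prompt = nn.Parameter(torch.randn(prompt_len, embed_dim) * 0.02)
        self.head = nn.Linear(embed_dim, num_labels)

    def forward(self, token_embeds: torch.Tensor) -> torch.Tensor:
        # token_embeds: (batch, seq_len, embed_dim)
        batch = token_embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        hidden = self.encoder(torch.cat([prompt, token_embeds], dim=1))
        return self.head(hidden.mean(dim=1))  # mean-pooled logits


def init_student_prompt(teacher_prompt: torch.Tensor, student_len: int) -> torch.Tensor:
    """Prompt initialization (assumed): average-pool the long teacher prompt
    down to the short student length so the student starts near the teacher."""
    # (teacher_len, dim) -> (1, dim, teacher_len) -> pooled -> (student_len, dim)
    pooled = F.adaptive_avg_pool1d(teacher_prompt.t().unsqueeze(0), student_len)
    return pooled.squeeze(0).t().contiguous()


def distill_step(teacher, student, token_embeds, labels, optimizer, alpha=0.5, T=2.0):
    """Prompt distillation (assumed): task loss plus temperature-scaled KL
    between teacher and student output distributions."""
    with torch.no_grad():
        t_logits = teacher(token_embeds)
    s_logits = student(token_embeds)
    kd = F.kl_div(F.log_softmax(s_logits / T, dim=-1),
                  F.softmax(t_logits / T, dim=-1),
                  reduction="batchmean") * T * T
    ce = F.cross_entropy(s_logits, labels)
    loss = alpha * kd + (1 - alpha) * ce
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    dim, num_labels = 32, 2
    encoder = nn.TransformerEncoder(
        nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True), num_layers=2)
    for p in encoder.parameters():
        p.requires_grad_(False)  # only prompts and classification heads are tuned

    # In practice the teacher's long prompt would first be tuned on the task;
    # it is left random here purely to keep the sketch short.
    teacher = PromptedClassifier(encoder, dim, prompt_len=64, num_labels=num_labels)
    student = PromptedClassifier(encoder, dim, prompt_len=8, num_labels=num_labels)  # 1/8 length
    with torch.no_grad():
        student.prompt.copy_(init_student_prompt(teacher.prompt, student_len=8))

    opt = torch.optim.AdamW([student.prompt, *student.head.parameters()], lr=1e-3)
    x = torch.randn(4, 16, dim)                    # dummy token embeddings
    y = torch.randint(0, num_labels, (4,))         # dummy labels
    print(distill_step(teacher, student, x, y, opt))
```

Under these assumptions, only the short student prompt and its head receive gradients, so the compressed prompt frees input space while the distillation loss keeps its behavior close to the longer teacher prompt.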