EXPLORING SOFT PROMPT INITIALIZATION STRATEGY FOR FEW-SHOT CONTINUAL TEXT CLASSIFICATION

Cited by: 1
Authors
Zhang, Zhehao [1]
Yu, Tong [2]
Zhao, Handong [2]
Xie, Kaige [3]
Yao, Lina [4, 5]
Li, Shuai [6]
Affiliations
[1] Dartmouth Coll, Hanover, NH 03755 USA
[2] Adobe Res, Waltham, MA 02451 USA
[3] Georgia Inst Technol, Atlanta, GA 30332 USA
[4] CSIRO, Data61, Eveleigh, Australia
[5] Univ New South Wales, Sydney, NSW, Australia
[6] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
Source
2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2024) | 2024
Keywords
Prompt-tuning; continual learning; text classification; few-shot learning; prompt initialization
DOI
10.1109/ICASSP48485.2024.10448063
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Few-shot continual learning (FSCL) is a challenging setting because it requires models to learn new knowledge from a few examples over time and to adapt quickly to new tasks without forgetting previous knowledge. Prompt-tuning, an efficient learning approach for language models, has shown competitive performance in data-efficient learning across various NLP tasks, which motivates us to explore how to perform prompt-tuning effectively in FSCL for text classification. In this work, we study prompt-tuning for continual classification, aiming to alleviate catastrophic forgetting and improve knowledge transfer with few-shot data in FSCL. After carefully analyzing the limited representation capability of existing soft-prompt initialization methods, we propose Task-Aware Initialization (TAI), a novel initialization approach that combines information from both the context and the label space. Extensive experiments with different language models, including a recent instruction-finetuned LLM, in two FSCL settings (shot-invariant and shot-variant) demonstrate the superiority of TAI over current approaches.
Pages: 12106-12110
Page count: 5