Learning Meta Soft Prompt for Few-Shot Language Models

Cited by: 5
Authors
Chien, Jen-Tzung [1]
Chen, Ming-Yen [1]
Xue, Jing-Hao [2]
Affiliations
[1] Natl Yang Ming Chiao Tung Univ, Inst Elect & Comp Engn, Hsinchu, Taiwan
[2] UCL, Dept Stat Sci, London, England
Source
2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC | 2023
Keywords
Meta learning; few-shot learning; prompt tuning; domain adaptation; language model;
DOI
10.1109/APSIPAASC58517.2023.10317500
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Prompt-based learning is an effective way to apply a large-scale pre-trained language model (PLM) to language understanding: input sentences are augmented either with a hard prompt made of word tokens or with a soft prompt made of trainable tokens. However, a soft prompt learned in the training domain may not help a frozen PLM handle domain shift in the test domain. This paper presents an approach that incorporates meta learning into domain adaptation to train a new soft prompt that generalizes the frozen PLM sufficiently to a number of domains. The resulting meta soft prompt is developed for few-shot unsupervised domain adaptation, where a frozen PLM can be quickly adapted to a target domain. The soft prompt is optimized by meta learning, with the domain adaptation loss and the prompt-based classification loss jointly minimized. Experiments on multi-domain natural language understanding show the benefits of the proposed meta soft prompt for a pre-trained language model, using BERT under the few-shot setting.
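One way to read the joint minimization described in the abstract is sketched below. The notation is an assumption made for illustration and is not taken from the paper: $\phi$ denotes the trainable soft prompt, $\theta$ the frozen PLM parameters, $\mathcal{T}$ a few-shot episode drawn from the training domains, and $\lambda$ a hypothetical weight balancing the two losses (the abstract does not say whether such a weight is used; $\lambda = 1$ gives a plain sum).

```latex
% Illustrative sketch only; symbols are assumptions, not the paper's notation.
% Only the soft prompt \phi is optimized; the PLM parameters \theta stay frozen.
\min_{\phi} \; \mathbb{E}_{\mathcal{T}}
  \Big[ \mathcal{L}_{\mathrm{cls}}(\phi; \theta, \mathcal{T})
      + \lambda \, \mathcal{L}_{\mathrm{da}}(\phi; \theta, \mathcal{T}) \Big]
```

Here $\mathcal{L}_{\mathrm{cls}}$ stands for the prompt-based classification loss and $\mathcal{L}_{\mathrm{da}}$ for the domain adaptation loss named in the abstract. How the meta-learning update is organized over episodes is not specified there, so it is left out of this sketch.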
Pages: 57-62
Page count: 6