PADA: Example-based Prompt Learning for on-the-fly Adaptation to Unseen Domains

Cited by: 31
Authors
Ben-David, Eyal [1 ]
Oved, Nadav [1 ]
Reichart, Roi [1 ]
Affiliations
[1] Technion Israel Inst Technol, Haifa, Israel
Keywords
Auto-regressive; Domain adaptation; Example-based; Language model; Language processing; Natural languages; Processing algorithms; Target domain; Test examples; Training time
DOI
10.1162/tacl_a_00468
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Natural Language Processing algorithms have made incredible progress, but they still struggle when applied to out-of-distribution examples. We address a challenging and underexplored version of this domain adaptation problem, where an algorithm is trained on several source domains and then applied to examples from unseen domains that are unknown at training time. In particular, no examples, labeled or unlabeled, and no other knowledge about the target domain are available to the algorithm at training time. We present PADA: an example-based autoregressive Prompt learning algorithm for on-the-fly Any-Domain Adaptation, based on the T5 language model. Given a test example, PADA first generates a unique prompt for it and then, conditioned on this prompt, labels the example with respect to the NLP prediction task. PADA is trained to generate a prompt that is a token sequence of unrestricted length, consisting of Domain Related Features (DRFs) that characterize each of the source domains. Intuitively, the generated prompt is a unique signature that maps the test example to a semantic space spanned by the source domains. In experiments with 3 tasks (text classification and sequence tagging), for a total of 14 multi-source adaptation scenarios, PADA substantially outperforms strong baselines.(1)
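The abstract describes a two-stage inference procedure: first generate a prompt of Domain Related Features (DRFs) for the test example, then predict the task label conditioned on that prompt. The sketch below illustrates this control flow only; the stub functions `generate_drf_prompt` and `classify` are hypothetical placeholders (a toy keyword match stands in for the paper's fine-tuned T5 generation and prediction heads), not the authors' actual implementation.

```python
# Illustrative sketch of PADA's two-stage, on-the-fly inference.
# Assumption: a real system would implement both stages with a
# fine-tuned T5 model; here toy stubs show only the data flow.

def generate_drf_prompt(example: str, drf_vocab: dict) -> str:
    """Stage 1: map the test example to a signature of Domain Related
    Features (DRFs) drawn from the source domains. A keyword lookup
    stands in for T5's autoregressive prompt generation."""
    hits = [drf for drf, cues in drf_vocab.items()
            if any(cue in example.lower() for cue in cues)]
    return " ".join(hits) if hits else "general"

def classify(prompt: str, example: str) -> str:
    """Stage 2: condition the task prediction on the generated prompt.
    A real system would feed the prompt-prefixed input back into T5."""
    text = f"{prompt}: {example}"
    return "positive" if ("great" in text or "love" in text) else "negative"

# Toy DRF vocabulary characterizing two source domains.
drf_vocab = {
    "electronics": ["battery", "screen"],
    "kitchen": ["blender", "knife"],
}

example = "The battery life is great on this tablet."
prompt = generate_drf_prompt(example, drf_vocab)  # signature of source-domain DRFs
label = classify(prompt, example)                 # task prediction conditioned on it
print(prompt, label)
```

Note the design point the abstract emphasizes: no target-domain data is consulted at any stage; the prompt only positions the unseen example within the semantic space spanned by the source domains.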
Pages: 414-433
Number of pages: 20