Stabilized In-Context Learning with Pre-trained Language Models for Few Shot Dialogue State Tracking

Cited by: 0
Authors
Chen, Derek [1]
Qian, Kun [1]
Yu, Zhou [1]
Affiliations
[1] Columbia Univ, Dialogue NLP Lab, New York, NY 10027 USA
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Prompt-based methods with large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks. These models improve even further with the addition of a few labeled in-context exemplars to guide output generation. However, for more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial, leading to unstable results. Furthermore, building in-context exemplars for dialogue tasks is difficult because conversational contexts are long while model input lengths are relatively short. To overcome these issues, we first adapt a meta-learning scheme to the dialogue domain which stabilizes the ability of the model to perform well under various prompts. We additionally design a novel training method to improve upon vanilla retrieval mechanisms to find ideal in-context examples. Finally, we introduce a saliency model to limit dialogue text length, allowing us to include more exemplars per query. In effect, we are able to achieve highly competitive results for few-shot DST on MultiWOZ.
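To make the described pipeline concrete, below is a minimal, illustrative sketch (not the authors' implementation) of retrieval-based exemplar selection and prompt assembly for few-shot DST. The bag-of-words similarity, the turn-truncation heuristic, and the prompt template are simple stand-ins for the paper's learned retriever and saliency model; all function names and the toy data are hypothetical.

```python
# Illustrative sketch only: retrieve similar labeled dialogues as in-context
# exemplars, truncate them to their most query-relevant turns, and build a prompt.
from collections import Counter
from math import sqrt


def bow_cosine(a: str, b: str) -> float:
    """Cosine similarity over bag-of-words counts (stand-in for a learned retriever)."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def truncate_dialogue(turns: list[str], query: str, max_turns: int = 4) -> list[str]:
    """Keep the turns most similar to the query (crude proxy for a saliency model)."""
    top = set(sorted(turns, key=lambda t: bow_cosine(t, query), reverse=True)[:max_turns])
    return [t for t in turns if t in top]  # preserve original turn order


def build_prompt(query_turns: list[str], pool: list[dict], k: int = 2) -> str:
    """Select the k most similar labeled dialogues and format them as in-context exemplars."""
    query_text = " ".join(query_turns)
    ranked = sorted(pool, key=lambda ex: bow_cosine(" ".join(ex["turns"]), query_text),
                    reverse=True)
    blocks = []
    for ex in ranked[:k]:
        turns = truncate_dialogue(ex["turns"], query_text)
        blocks.append("\n".join(turns) + f"\nBelief state: {ex['state']}")
    blocks.append("\n".join(query_turns) + "\nBelief state:")
    return "\n\n".join(blocks)


# Toy usage: two labeled exemplars, one unlabeled query dialogue.
pool = [
    {"turns": ["User: I need a cheap hotel in the north.",
               "System: Okay, any star rating?"],
     "state": "hotel-price=cheap; hotel-area=north"},
    {"turns": ["User: Book a taxi to the museum at 5pm."],
     "state": "taxi-destination=museum; taxi-leaveat=17:00"},
]
print(build_prompt(["User: Looking for a cheap place to stay up north."], pool))
```

In the paper, the retriever is trained rather than hand-crafted and the saliency model is learned, but the overall flow (score the exemplar pool, compress each exemplar, concatenate exemplars with the query) follows this shape.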
Pages: 1551-1564
Number of pages: 14
Related Papers
50 records in total
  • [1] In-Context Analogical Reasoning with Pre-Trained Language Models
    Hu, Xiaoyang
    Storks, Shane
    Lewis, Richard L.
    Chai, Joyce
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 1953 - 1969
  • [2] Making Pre-trained Language Models Better Few-shot Learners
    Gao, Tianyu
    Fisch, Adam
    Chen, Danqi
    59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (ACL-IJCNLP 2021), VOL 1, 2021, : 3816 - 3830
  • [3] Investigating Prompt Learning for Chinese Few-Shot Text Classification with Pre-Trained Language Models
    Song, Chengyu
    Shao, Taihua
    Lin, Kejing
    Liu, Dengfeng
    Wang, Siyuan
    Chen, Honghui
    APPLIED SCIENCES-BASEL, 2022, 12 (21)
  • [4] Few-Shot NLG with Pre-Trained Language Model
    Chen, Zhiyu
    Eavani, Harini
    Chen, Wenhu
    Liu, Yinyin
    Wang, William Yang
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 183 - 190
  • [5] Pathologies of Pre-trained Language Models in Few-shot Fine-tuning
    Chen, Hanjie
    Zheng, Guoqing
    Awadallah, Ahmed Hassan
    Ji, Yangfeng
    PROCEEDINGS OF THE THIRD WORKSHOP ON INSIGHTS FROM NEGATIVE RESULTS IN NLP (INSIGHTS 2022), 2022, : 144 - 153
  • [6] Extracting Business Process Entities and Relations from Text Using Pre-trained Language Models and In-Context Learning
    Bellan, Patrizio
    Dragoni, Mauro
    Ghidini, Chiara
    ENTERPRISE DESIGN, OPERATIONS, AND COMPUTING, EDOC 2022, 2022, 13585 : 182 - 199
  • [7] TOKEN Is a MASK: Few-shot Named Entity Recognition with Pre-trained Language Models
    Davody, Ali
    Adelani, David Ifeoluwa
    Kleinbauer, Thomas
    Klakow, Dietrich
    TEXT, SPEECH, AND DIALOGUE (TSD 2022), 2022, 13502 : 138 - 150
  • [8] Defending Pre-trained Language Models as Few-shot Learners against Backdoor Attacks
    Xi, Zhaohan
    Du, Tianyu
    Li, Changjiang
    Pang, Ren
    Ji, Shouling
    Chen, Jinghui
    Ma, Fenglong
    Wang, Ting
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [9] Context Analysis for Pre-trained Masked Language Models
    Lai, Yi-An
    Lalwani, Garima
    Zhang, Yi
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 3789 - 3804
  • [10] Zero-shot language extension for dialogue state tracking via pre-trained models and multi-auxiliary-tasks fine-tuning
    Xiang, Lu
    Zhao, Yang
    Zhu, Junnan
    Zhou, Yu
    Zong, Chengqing
    KNOWLEDGE-BASED SYSTEMS, 2023, 259