Robust Dialogue State Tracking with Weak Supervision and Sparse Data

Cited by: 8
Authors
Heck, Michael [1]
Lubis, Nurul [1]
van Niekerk, Carel [1]
Feng, Shutong [1]
Geishauser, Christian [1]
Lin, Hsien-Chin [1]
Gasic, Milica [1]
Affiliations
[1] Heinrich Heine Univ Dusseldorf, Dusseldorf, Germany
Funding
European Research Council
DOI
10.1162/tacl_a_00513
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Generalizing dialogue state tracking (DST) to new data is especially challenging due to the strong reliance on abundant and fine-grained supervision during training. Sample sparsity, distributional shift, and the occurrence of new concepts and topics frequently lead to severe performance degradation during inference. In this paper we propose a training strategy to build extractive DST models without the need for fine-grained manual span labels. Two novel input-level dropout methods mitigate the negative impact of sample sparsity. We propose a new model architecture with a unified encoder that supports value as well as slot independence by leveraging the attention mechanism. We combine the strengths of triple copy strategy DST and value matching to benefit from complementary predictions without violating the principle of ontology independence. Our experiments demonstrate that an extractive DST model can be trained without manual span labels. Our architecture and training strategies improve robustness towards sample sparsity, new concepts, and topics, leading to state-of-the-art performance on a range of benchmarks. We further highlight our model's ability to effectively learn from non-dialogue data.
Pages: 1175-1192
Page count: 18