Pre-training Multi-party Dialogue Models with Latent Discourse Inference

Cited by: 0
Authors
Li, Yiyang [1 ,2 ]
Huang, Xinting [3 ]
Bi, Wei [3 ]
Zhao, Hai [1 ,2 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai, Peoples R China
[2] Shanghai Jiao Tong Univ, Key Lab Shanghai Educ Commiss Intelligent Interac, Shanghai, Peoples R China
[3] Tencent AI Lab, NLP Ctr, Shenzhen, Peoples R China
Source
PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1 | 2023
Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-party dialogues are harder for models to understand than one-to-one two-party dialogues, because they involve multiple interlocutors and therefore interwoven reply-to relations and information flows. An effective way to overcome these obstacles is to pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying. However, because explicitly annotated discourse labels are scarce in multi-party dialogue corpora, previous works fail to scale up the pre-training process and leave the unlabeled multi-party conversational data unexploited. To make full use of the unlabeled data, we propose to treat the discourse structures as latent variables, and to jointly infer them and pre-train the discourse-aware model with unsupervised latent variable inference methods. Experiments on multiple downstream tasks show that our pre-trained model outperforms strong baselines by large margins and achieves state-of-the-art (SOTA) results, justifying the effectiveness of our method. The official implementation of this paper is available at https://github.com/EricLee8/MPD_EMVI.
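To make the idea of latent discourse inference concrete, the sketch below shows one common way such a scheme can be set up: the parent ("reply-to") utterance of each utterance is treated as a latent categorical variable and sampled with a Gumbel-Softmax relaxation so that the link scorer can be trained jointly with the dialogue encoder. This is an illustrative assumption-based example, not the authors' released implementation (see the repository linked above); the class name LatentDiscourseInference, the bilinear scorer, and the parameter tau are hypothetical.

    # Illustrative sketch only (hypothetical names); not the paper's released code.
    # Each utterance's reply-to parent is a latent categorical variable, sampled
    # differentiably with a Gumbel-Softmax relaxation.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class LatentDiscourseInference(nn.Module):
        def __init__(self, hidden_size: int = 768, tau: float = 1.0):
            super().__init__()
            self.tau = tau  # temperature of the Gumbel-Softmax relaxation
            # Bilinear scorer for "utterance i replies to utterance j" (j < i)
            self.scorer = nn.Bilinear(hidden_size, hidden_size, 1)

        def forward(self, utt_reprs: torch.Tensor):
            # utt_reprs: (num_utts, hidden_size) pooled utterance representations
            num_utts = utt_reprs.size(0)
            parents = []
            for i in range(1, num_utts):
                cur = utt_reprs[i].unsqueeze(0).repeat(i, 1)          # (i, hidden)
                scores = self.scorer(cur, utt_reprs[:i]).squeeze(-1)  # (i,)
                # Differentiable, approximately one-hot sample of the parent utterance
                parents.append(F.gumbel_softmax(scores, tau=self.tau, hard=True))
            return parents  # one relaxed parent choice per non-first utterance

In such a setup, the sampled parent links would be fed into a discourse-aware encoder and the whole model trained end-to-end with an ordinary pre-training objective (e.g., masked language modeling or response selection), so that gradients reach the link scorer through the relaxation even though no discourse labels are available.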
Pages: 9584-9599
Page count: 16