A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation

Cited by: 150
Authors
Guan, Jian [1 ,3 ,4 ]
Huang, Fei [1 ,3 ,4 ]
Zhao, Zhihao [2 ]
Zhu, Xiaoyan [1 ,3 ,4 ]
Huang, Minlie [1 ,3 ,4 ]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
[2] Beihang Univ, Sch Software, Beijing, Peoples R China
[3] Inst Artificial Intelligence, State Key Lab Intelligent Technol & Syst, Beijing, Peoples R China
[4] Beijing Natl Res Ctr Informat Sci & Technol, Beijing, Peoples R China
Funding
National Science Foundation of China (NSFC); National Key Research and Development Program of China
Keywords
Computational linguistics - Computer circuits - Learning systems;
DOI
10.1162/tacl_a_00302
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Story generation, namely, generating a reasonable story from a leading context, is an important but challenging task. Despite their success in modeling fluency and local coherence, existing neural language generation models (e.g., GPT-2) still suffer from repetition, logic conflicts, and a lack of long-range coherence in generated stories. We conjecture that this is because of the difficulty of associating relevant commonsense knowledge, understanding causal relationships, and planning entities and events with proper temporal order. In this paper, we devise a knowledge-enhanced pretraining model for commonsense story generation. We propose to utilize commonsense knowledge from external knowledge bases to generate reasonable stories. To further capture the causal and temporal dependencies between the sentences of a reasonable story, we employ multi-task learning, which combines a discriminative objective to distinguish true stories from fake ones during fine-tuning. Automatic and manual evaluation shows that our model can generate more reasonable stories than state-of-the-art baselines, particularly in terms of logic and global coherence.
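As a rough illustration of the multi-task fine-tuning objective described in the abstract, the sketch below (PyTorch-style Python) combines a standard language-modeling loss with a discriminative loss that separates true stories from corrupted ones. This is a minimal sketch under assumed tensor shapes, not the authors' implementation; the function name `multitask_loss` and the weighting factor `alpha` are hypothetical.

```python
# Minimal sketch (not the paper's code): a multi-task fine-tuning loss
# combining auto-regressive story generation with a true-vs-fake story
# classifier. All names and shapes here are illustrative assumptions.
import torch.nn.functional as F


def multitask_loss(lm_logits, target_ids, cls_logits, real_labels, alpha=1.0):
    """lm_logits:   (batch, seq_len, vocab) next-token predictions
    target_ids:  (batch, seq_len) gold token ids, already shifted by one
    cls_logits:  (batch, 2) scores for {fake, true} classification
    real_labels: (batch,) long tensor, 1 for true stories, 0 for fake ones
    alpha:       hypothetical weight on the discriminative term
    """
    # Generation objective: standard cross-entropy over story tokens.
    lm_loss = F.cross_entropy(
        lm_logits.reshape(-1, lm_logits.size(-1)),
        target_ids.reshape(-1),
    )
    # Discriminative objective: tell true stories apart from corrupted
    # ones (e.g., stories with shuffled or substituted sentences).
    cls_loss = F.cross_entropy(cls_logits, real_labels)
    return lm_loss + alpha * cls_loss
```

Training against such corrupted negatives is what pushes the model toward causal and temporal consistency rather than local fluency alone.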
Pages: 93-108
Page count: 16