Document-level relation extraction using evidence reasoning on RST-GRAPH

Cited by: 38
Authors
Wang, Hailin [1 ,2 ]
Qin, Ke [1 ,2 ]
Lu, Guoming [1 ,2 ]
Yin, Jin [1 ,2 ]
Zakari, Rufai Yusuf [1 ,2 ]
Owusu, Jim Wilson [1 ,2 ]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, 2006 Xiyuan Ave, Chengdu 611731, Sichuan, Peoples R China
[2] Trusted Cloud Comp & Big Data Key Lab Sichuan Pro, Chengdu, Peoples R China
Keywords
Relation extraction; Rhetorical structure theory; Document-level; Evidence reasoning;
DOI
10.1016/j.knosys.2021.107274
CLC number
TP18 [Theory of artificial intelligence];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Document-level relation extraction (RE) is a more challenging task that offers a new perspective on larger and more complex text-mining problems. Recent document-level RE research, following traditional sentence-level methods, focuses on learning a representation of a single sentence and highlighting the importance of particular words. When applied to a document with longer text, more entities, and more complicated semantics, these traditional methods lack the ability to select evidence and reason over it, and may therefore fail to identify all potential relations in the document in an intuitive way. However, textual semantic associations between entities, as well as the logical structure of the document, can provide evidence for potential relations and explain the reasoning process. Hence, by introducing Rhetorical Structure Theory (RST) as external knowledge, this article selects appropriate evidence and exposes the reasoning process on a new document graph, RST-GRAPH, which indicates valid semantic associations between multiple text units through RST and incorporates a set of reasoning modules to capture efficient evidence. Our experimental results show that RST-GRAPH, as the first work to introduce RST for this task, builds associations between any pair of entities in an interpretable manner along discourse-relation links (in contrast to previous graph-oriented models), exhibits a clear evidence-reasoning process with competitive performance on the DocRED dataset, and outperforms most existing models on Ign F1. (C) 2021 Elsevier B.V. All rights reserved.
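The abstract's core idea — a document graph whose nodes are text units linked by RST discourse relations, over which evidence chains between entities can be traced — can be sketched minimally as follows. This is an illustrative sketch, not the authors' implementation: the function names, the toy EDU segmentation, and the undirected treatment of discourse links are all assumptions; the paper's actual reasoning modules are learned, not a plain graph search.

```python
from collections import deque

def build_rst_graph(edus, rst_links):
    """Build an adjacency list over elementary discourse units (EDUs).

    edus: {edu_id: set of entity names mentioned in that unit}
    rst_links: iterable of (edu_a, edu_b, discourse_relation) triples
    """
    graph = {edu_id: [] for edu_id in edus}
    for a, b, rel in rst_links:
        graph[a].append((b, rel))
        graph[b].append((a, rel))  # simplification: treat discourse links as undirected
    return graph

def evidence_path(graph, edus, head, tail):
    """BFS from an EDU mentioning `head` to one mentioning `tail`.

    Returns the chain of discourse links as interpretable evidence,
    or None if no chain connects the two entities.
    """
    starts = [e for e, ents in edus.items() if head in ents]
    goals = {e for e, ents in edus.items() if tail in ents}
    queue = deque((s, []) for s in starts)
    seen = set(starts)
    while queue:
        node, path = queue.popleft()
        if node in goals:
            return path
        for nxt, rel in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, rel, nxt)]))
    return None

# Toy document: three EDUs, two RST links (relation labels are illustrative).
edus = {0: {"Alice"}, 1: set(), 2: {"Acme"}}
links = [(0, 1, "Elaboration"), (1, 2, "Background")]
g = build_rst_graph(edus, links)
print(evidence_path(g, edus, "Alice", "Acme"))
# -> [(0, 'Elaboration', 1), (1, 'Background', 2)]
```

The returned chain makes the reasoning explicit in the sense the abstract claims: each hop names the discourse relation that justifies connecting two text units on the way from the head entity to the tail entity.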
Pages: 12
References (49 in total)
[1] Agichtein E., 2000, ACM 2000. Digital Libraries. Proceedings of the Fifth ACM Conference on Digital Libraries, P85, DOI 10.1145/336597.336644
[2] [Anonymous], 2006, INT C APPL NATURAL L
[3] [Anonymous], 2014, P 2014 C EMP METH NA
[4] [Anonymous], 2020, ARXIV PREPRINT ARXIV
[5] Brin S, 1999, LECT NOTES COMPUT SC, V1590, P172
[6] Cai R, 2016, PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, P756
[7] ChunYang Liu, 2013, Advanced Data Mining and Applications. 9th International Conference, ADMA 2013. Proceedings: LNCS 8347, P231, DOI 10.1007/978-3-642-53917-6_21
[8] Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171
[9] Dyer C, 2015, PROCEEDINGS OF THE 53RD ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 7TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1, P334
[10] Eberts M, 2021, 16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), P3650