Relational distance and document-level contrastive pre-training based relation extraction model

Cited by: 14
Authors
Dong, Yihao [1 ]
Xu, Xiaolong [2 ]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Jiangsu Key Lab Big Data Secur & Intelligent Proc, 9 Wenyuan Rd, Nanjing 210023, Peoples R China
[2] Nanjing Univ Posts & Telecommun, Sch Comp Sci, 9 Wenyuan Rd, Nanjing 210023, Peoples R China
Keywords
Extraction - Graphic methods;
DOI
10.1016/j.patrec.2023.02.012
CLC classification number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Compared with sentence-level relation extraction, document-level relation extraction involves multiple entities and multiple mentions, so existing sentence-level relation extraction models cannot meet its requirements. Existing graph-based document-level models usually design nodes and edges manually, which often introduces human-made noise, while Transformer-based models cannot fully resolve difficulties such as coreference resolution through pre-training tasks or other methods. In this paper, we propose a new Relational Distance and Document-level Contrastive Pre-training (RDDCP) based relation extraction model, which achieves coreference resolution through simple and effective mention replacement. We also introduce the concept of relational distance to enable document-level contrastive pre-training, finding the most likely relational mention pairs among the many mention pairs present in a document-level dataset for contrastive learning. For the relation information in distant mentions ignored by the relational distance, we quantify the distances as weights and incorporate the weighted information into the embedding representations of entities, so that each entity has a different embedding representation in different entity pairs. We conducted experiments on three popular datasets, and the RDDCP model outperformed GAIN, SSAN, ATLOP, and other baseline models in both performance and time complexity. © 2023 Elsevier B.V. All rights reserved.
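The record does not give the paper's exact formulation, but the distance-as-weights idea in the abstract can be sketched roughly as follows. This is a minimal illustrative example, not the authors' method: the function name, the exponential-decay weighting, and the position encoding are all assumptions made here for clarity.

```python
import numpy as np

def distance_weighted_entity_embedding(mention_embs, mention_pos, anchor_pos):
    """Illustrative sketch only (not the RDDCP implementation):
    weight each mention of an entity by its token distance to the
    other entity of the pair, so the same entity can receive a
    different embedding in different entity pairs.

    mention_embs : list of mention embedding vectors for one entity
    mention_pos  : token positions of those mentions
    anchor_pos   : token position of the paired entity
    """
    dists = np.abs(np.array(mention_pos, dtype=float) - anchor_pos)
    if dists.max() > 0:
        # Hypothetical choice: exponential decay, so nearer mentions dominate
        weights = np.exp(-dists / dists.max())
    else:
        weights = np.ones_like(dists)
    weights = weights / weights.sum()
    # Weighted sum keeps distant mentions' information with reduced influence
    return (weights[:, None] * np.array(mention_embs)).sum(axis=0)
```

For example, with two mentions at positions 2 and 10 and an anchor entity at position 0, the nearer mention receives the larger weight, and moving the anchor changes the resulting entity embedding.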
Pages: 132-140
Page count: 9