Interaction and Fusion of Rich Textual Information Network for Document-level Relation Extraction

Times Cited: 0
Authors
Zhong, Yu [1 ]
Shen, Bo [1 ]
Wang, Tao [1 ]
Zhang, Jinglin [1 ]
Liu, Yun [1 ]
Affiliations
[1] Beijing Jiaotong Univ, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Natural language processing; Document-level relation extraction; Graph convolutional network;
DOI
10.3897/jucs.130588
CLC Classification Number
TP31 [Computer Software]
Discipline Classification Code
081202; 0835
Abstract
Detecting relations between entities across multiple sentences in a document, referred to as document-level relation extraction, poses a challenge in natural language processing. Graph networks have gained widespread application for their ability to capture long-range contextual dependencies in documents. However, previous studies have often been limited to using only two or three types of nodes to construct document graphs, which leads to insufficient utilization of the rich information within documents and inadequate aggregation of contextual information. Additionally, related relation labels often co-occur in documents, yet existing methods rarely model these label dependencies. In this paper, we propose the Interaction and Fusion of Rich Textual Information Network (IFRTIN), which considers multiple types of nodes simultaneously. First, we utilize the structural, syntactic, and discourse information in the document to construct a document graph that captures global dependency relationships. Next, we design a regularizer that encourages the model to capture dependencies among relation labels. Furthermore, we design an Adaptive Encouraging Loss, which encourages well-classified instances to contribute more to the overall loss, thereby enhancing the effectiveness of the model. Experimental results demonstrate that our approach achieves a significant improvement on three document-level relation extraction datasets. Specifically, IFRTIN outperforms existing models by improving the F1 score by 0.67% on DocRED, 1.2% on CDR, and 1.3% on GDA. These results highlight the effectiveness of our approach in leveraging rich textual information and modeling label dependencies for document-level relation extraction.
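The abstract's "Adaptive Encouraging Loss" reverses the usual focal-loss intuition by letting well-classified instances contribute more, not less, to the overall loss. The following is a minimal sketch of what such a re-weighted multi-label loss could look like; the function name adaptive_encouraging_loss, the gamma parameter, and the (1 + p_t)^gamma weighting factor are illustrative assumptions, not the authors' formulation.

    # Illustrative sketch only: the record describes an "Adaptive Encouraging Loss"
    # that lets well-classified instances contribute more to the overall loss.
    # The exact formulation is not given here, so the weighting scheme below
    # (a focal-style factor with the exponent's effect reversed) is an assumption.
    import torch
    import torch.nn.functional as F

    def adaptive_encouraging_loss(logits: torch.Tensor,
                                  labels: torch.Tensor,
                                  gamma: float = 1.0) -> torch.Tensor:
        """Binary cross-entropy re-weighted so confident, correct predictions
        are up-weighted rather than down-weighted (hypothetical form)."""
        probs = torch.sigmoid(logits)
        # p_t: probability assigned to the true label for each (entity pair, relation)
        p_t = probs * labels + (1.0 - probs) * (1.0 - labels)
        bce = F.binary_cross_entropy_with_logits(logits, labels, reduction="none")
        # (1 + p_t)^gamma grows with p_t, so well-classified instances weigh more.
        weight = (1.0 + p_t) ** gamma
        return (weight * bce).mean()

    # Usage with random multi-label relation scores (4 entity pairs, 97 relation types)
    logits = torch.randn(4, 97)
    labels = torch.randint(0, 2, (4, 97)).float()
    print(adaptive_encouraging_loss(logits, labels))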
Pages: 1112-1136
Number of Pages: 25