Leveraging Knowledge Graph Embeddings to Enhance Contextual Representations for Relation Extraction

Cited by: 1
Authors
Laleye, Frejus A. A. [1 ]
Rakotoson, Loic [1 ]
Massip, Sylvain [1 ]
Affiliations
[1] Opscidia, Paris, France
Source
DOCUMENT ANALYSIS AND RECOGNITION - ICDAR 2023 WORKSHOPS, PT II | 2023, Vol. 14194
Keywords
Relation extraction; Knowledge Graph Embeddings; Contextual representation;
DOI
10.1007/978-3-031-41501-2_2
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Relation extraction is a crucial and challenging task in Natural Language Processing. Many methods have emerged recently that perform notably well on the task; however, most of them rely on vast amounts of data from large-scale knowledge graphs or on language models pretrained on voluminous corpora. In this paper, we focus on effectively using only the knowledge supplied by a corpus to build a high-performing model. Our objective is to show that, by leveraging the hierarchical structure and relational distribution of entities within a corpus, without introducing external knowledge, a relation extraction model can achieve significantly better performance. We therefore propose a relation extraction approach that incorporates knowledge graph embeddings, pretrained at the corpus scale, into the sentence-level contextual representation. We conducted a series of experiments that yielded promising results: our method outperforms context-based relation extraction models.
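The central idea of the abstract, enriching a sentence-level contextual representation with corpus-scale knowledge graph embeddings before classifying the relation, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's actual implementation: the dimensions, the fusion by concatenation, and the linear softmax classifier are all hypothetical choices, and the random vectors stand in for real encoder and KG-embedding outputs (e.g. a BERT-like sentence vector and TransE-style entity vectors).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions; the abstract does not specify them.
CTX_DIM = 8      # sentence-level contextual embedding size
KG_DIM = 4       # knowledge graph embedding size (per entity)
N_RELATIONS = 3  # number of relation classes

def fuse_and_classify(ctx_vec, head_kg, tail_kg, W, b):
    """Concatenate the contextual sentence vector with the KG embeddings
    of the head and tail entities, then apply a linear softmax classifier."""
    fused = np.concatenate([ctx_vec, head_kg, tail_kg])
    logits = W @ fused + b
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()

# Toy inputs standing in for real model outputs.
ctx = rng.normal(size=CTX_DIM)                              # contextual encoder output
head = rng.normal(size=KG_DIM)                              # head-entity KG embedding
tail = rng.normal(size=KG_DIM)                              # tail-entity KG embedding
W = rng.normal(size=(N_RELATIONS, CTX_DIM + 2 * KG_DIM))    # classifier weights
b = np.zeros(N_RELATIONS)                                   # classifier bias

probs = fuse_and_classify(ctx, head, tail, W, b)
print(probs.shape, float(probs.sum()))
```

Concatenation is only one plausible fusion strategy; gated or attention-based combinations of the two representation spaces would follow the same overall shape, with the KG side supplying entity-level relational signal that the contextual side alone lacks.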
Pages: 19-31
Page count: 13