Self-training on refined clause patterns for relation extraction

Cited by: 23
Authors
Duc-Thuan Vo [1]
Bagheri, Ebrahim [1]
Affiliations
[1] Ryerson Univ, Lab Syst Software & Semant LS3, Toronto, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Relation extraction; Open information extraction; Self-training algorithm; Syntactic parsing; Dependency parsing; INFORMATION EXTRACTION; KNOWLEDGE; DOMAIN;
DOI
10.1016/j.ipm.2017.02.009
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Within the context of Information Extraction (IE), relation extraction is oriented towards identifying a variety of relation phrases and their arguments in arbitrary sentences. In this paper, we present a clause-based framework for information extraction in textual documents. Our framework focuses on two important challenges in information extraction: 1) Open Information Extraction (OIE), and 2) Relation Extraction (RE). In the plethora of research that focuses on the use of syntactic and dependency parsing for detecting relations, there has been increasing evidence of incoherent and uninformative extractions. The extracted relations may even be erroneous at times and fail to provide a meaningful interpretation. In our work, we use the English clause structure and clause types to generate propositions that can be deemed extractable relations. Moreover, we propose refinements to the grammatical structure of syntactic and dependency parsing that help reduce the number of incoherent and uninformative extractions from clauses. In our experiments in both the open information extraction and relation extraction domains, we carefully evaluate our system on various benchmark datasets and compare its performance against existing state-of-the-art information extraction systems. Our work shows improved performance compared to the state-of-the-art techniques. (C) 2017 Elsevier Ltd. All rights reserved.
Pages: 686-706
Number of pages: 21