A Framework for Relationship Extraction from Unstructured Text via Link Grammar Parsing

Citations: 0
Authors
Samuel, Kenneth [1]
Savas, Onur [1]
Manikonda, Vikram [1]
Affiliations
[1] Intelligent Automation Inc., 15400 Calhoun Dr., Ste. 190, Rockville, MD 20855, USA
Source
NEXT-GENERATION ANALYST VI | 2018 / Vol. 10653
Keywords
Text analytics; natural language processing; NLP; relationship extraction; named entity recognition; unstructured text; data analytics; link grammar
DOI
10.1117/12.2306550
Chinese Library Classification
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
A major task in information extraction is to extract relations between named entities. Relation extraction not only builds and extends knowledge bases and ontologies but also supports downstream applications such as graph mining. In this paper, we report a relation extraction framework based on the natural language theory of link grammar. Our methodology uses and extends Akbik and Broß's Wanderlust approach, in which linguistic paths defined over the dependency grammar of sentences guide the relation extraction process. In particular, our framework splits a document into sentences, creates a dependency tree for each sentence, tags and categorizes entities, and extracts relations between these entities. The accuracy of our framework is parametrized by the choice of linguistic paths, and scores as high as 95% precision, 36% recall, and 44% F-score are obtained. We also envision natural extensions of our work, in which cross-sentence references are resolved and/or the context and content of the sentence constrain the linguistic paths.
Pages: 17
Related papers
14 in total
[1] Agichtein E., 2000, ACM 2000. Digital Libraries. Proceedings of the Fifth ACM Conference on Digital Libraries, P85, DOI 10.1145/336597.336644
[2] Akbik Alan, 2009, SEM SEARCH SEMSEARCH
[3] [Anonymous], 2016, Reuters
[4] [Anonymous], 2006, P 12 ACM SIGKDD INT, DOI 10.1145/1150402.1150492
[5] Auer, Soeren; Bizer, Christian; Kobilarov, Georgi; Lehmann, Jens; Cyganiak, Richard; Ives, Zachary. DBpedia: A nucleus for a web of open data [J]. SEMANTIC WEB, PROCEEDINGS, 2007, 4825: 722-+
[6] Banerjee S., 2002, Computational Linguistics and Intelligent Text Processing. Third International Conference, CICLing 2002. Proceedings (Lecture Notes in Computer Science Vol. 2276), P136
[7] Banko Michele, 2007, P IJCAI
[8] Brin S, 1999, LECT NOTES COMPUT SC, V1590, P172
[9] Doddington G. R., The automatic content extraction (ACE) program-tasks, data, and evaluation
[10] Grisham R., JAVA EXTRACTION TOOL