Ontology-Driven Information Extraction from Research Publications

Times cited: 5
Authors
Pertsas, Vayianos [1]
Constantopoulos, Panos [1, 2]
Affiliations
[1] Athens Univ Econ & Business, Dept Informat, Athens, Greece
[2] Athena Res Ctr, Digital Curat Unit, Athens, Greece
Source
DIGITAL LIBRARIES FOR OPEN KNOWLEDGE, TPDL 2018 | 2018 / Vol. 11057
Keywords
Information extraction from text; Ontology population; Linked data; Knowledge base creation;
DOI
10.1007/978-3-030-00066-0_21
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Extraction of information from a research article, association with other sources and inference of new knowledge is a challenging task that has not yet been entirely addressed. We present Research Spotlight, a system that leverages existing information from DBpedia, retrieves articles from repositories, extracts and interrelates various kinds of named and non-named entities by exploiting article metadata, the structure of text as well as syntactic, lexical and semantic constraints, and populates a knowledge base in the form of RDF triples. An ontology designed to represent scholarly practices is driving the whole process. The system is evaluated through two experiments that measure the overall accuracy in terms of token- and entity-based precision, recall and F1 scores, as well as entity boundary detection, with promising results.
Pages: 241-253
Page count: 13