TechMiner: Extracting Technologies from Academic Publications

Cited by: 7
Authors
Osborne, Francesco [1]
de Ribaupierre, Helene [1, 2]
Motta, Enrico [1, 2]
Affiliations
[1] Open Univ, Knowledge Media Inst, Milton Keynes, Bucks, England
[2] Univ Oxford, Dept Comp Sci, Oxford OX1 2JD, England
Source
KNOWLEDGE ENGINEERING AND KNOWLEDGE MANAGEMENT, EKAW 2016 | 2016 / Vol. 10024
Keywords
Scholarly data; Ontology learning; Bibliographic data; Scholarly ontologies; Data mining
DOI
10.1007/978-3-319-49004-5_30
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years we have seen the emergence of a variety of scholarly datasets. Typically these capture 'standard' scholarly entities and their connections, such as authors, affiliations, venues, publications, citations, and others. However, as the repositories grow and the technology improves, researchers are adding new entities to these repositories to develop a richer model of the scholarly domain. In this paper, we introduce TechMiner, a new approach, which combines NLP, machine learning and semantic technologies, for mining technologies from research publications and generating an OWL ontology describing their relationships with other research entities. The resulting knowledge base can support a number of tasks, such as: richer semantic search, which can exploit the technology dimension to support better retrieval of publications; richer expert search; monitoring the emergence and impact of new technologies, both within and across scientific fields; studying the scholarly dynamics associated with the emergence of new technologies; and others. TechMiner was evaluated on a manually annotated gold standard and the results indicate that it significantly outperforms alternative NLP approaches and that its semantic features improve performance significantly with respect to both recall and precision.
Pages: 463-479
Page count: 17