TechMiner: Extracting Technologies from Academic Publications

Cited by: 7
Authors
Osborne, Francesco [1]
de Ribaupierre, Helene [1, 2]
Motta, Enrico [1, 2]
Affiliations
[1] Open Univ, Knowledge Media Inst, Milton Keynes, Bucks, England
[2] Univ Oxford, Dept Comp Sci, Oxford OX1 2JD, England
Source
KNOWLEDGE ENGINEERING AND KNOWLEDGE MANAGEMENT, EKAW 2016 | 2016 / Vol. 10024
Keywords
Scholarly data; Ontology learning; Bibliographic data; Scholarly ontologies; Data mining;
DOI
10.1007/978-3-319-49004-5_30
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years we have seen the emergence of a variety of scholarly datasets. Typically these capture 'standard' scholarly entities and their connections, such as authors, affiliations, venues, publications, citations, and others. However, as the repositories grow and the technology improves, researchers are adding new entities to these repositories to develop a richer model of the scholarly domain. In this paper, we introduce TechMiner, a new approach, which combines NLP, machine learning and semantic technologies, for mining technologies from research publications and generating an OWL ontology describing their relationships with other research entities. The resulting knowledge base can support a number of tasks, such as: richer semantic search, which can exploit the technology dimension to support better retrieval of publications; richer expert search; monitoring the emergence and impact of new technologies, both within and across scientific fields; studying the scholarly dynamics associated with the emergence of new technologies; and others. TechMiner was evaluated on a manually annotated gold standard and the results indicate that it significantly outperforms alternative NLP approaches and that its semantic features improve performance significantly with respect to both recall and precision.
Pages: 463-479
Number of pages: 17