Evaluating Entity Linking with Wikipedia

Cited by: 139
Authors
Hachey, Ben [1]
Radford, Will [2, 3]
Nothman, Joel [2, 3]
Honnibal, Matthew [4]
Curran, James R. [2, 3]
Affiliations
[1] Thomson Reuters Corp, Res & Dev, St Paul, MN 55123 USA
[2] Univ Sydney, Sch Informat Technol, Sydney, NSW 2006, Australia
[3] Capital Markets CRC, Sydney, NSW 2000, Australia
[4] Macquarie Univ, Dept Comp, N Ryde, NSW 2109, Australia
Keywords
Named Entity Linking; Disambiguation; Information extraction; Wikipedia; Semi-structured resources; WEB
DOI
10.1016/j.artint.2012.04.005
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Named Entity Linking (NEL) grounds entity mentions to their corresponding node in a Knowledge Base (KB). Recently, a number of systems have been proposed for linking entity mentions in text to Wikipedia pages. Such systems typically search for candidate entities and then disambiguate them, returning either the best candidate or NIL. However, comparison has focused on disambiguation accuracy, making it difficult to determine how search impacts performance. Furthermore, important approaches from the literature have not been systematically compared on standard data sets. We reimplement three seminal NEL systems and present a detailed evaluation of search strategies. Our experiments find that coreference and acronym handling lead to substantial improvement, and search strategies account for much of the variation between systems. This is an interesting finding, because these aspects of the problem have often been neglected in the literature, which has focused largely on complex candidate ranking algorithms. (C) 2012 Elsevier B.V. All rights reserved.
Pages: 130-150
Page count: 21