Computing Semantic Similarity of Concepts in Knowledge Graphs

Cited by: 155
Authors
Zhu, Ganggao [1]
Iglesias, Carlos A. [1]
Affiliation
[1] Univ Politecn Madrid, Escuela Tecn Super Ingn Telecomunicac, Avda Complutense 30, E-28040 Madrid, Spain
Keywords
Semantic similarity; semantic relatedness; information content; knowledge graph; WordNet; DBpedia; REPRESENTATION; FREQUENCY; WORDNET; QUERIES;
DOI
10.1109/TKDE.2016.2610428
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a method for measuring the semantic similarity between concepts in Knowledge Graphs (KGs) such as WordNet and DBpedia. Previous work on semantic similarity has focused either on the structure of the semantic network between concepts (e.g., path length and depth) or only on the Information Content (IC) of concepts. We propose a semantic similarity method, namely wpath, that combines these two approaches by using IC to weight the shortest path length between concepts. Conventional corpus-based IC is computed from the distribution of concepts over a textual corpus, which requires preparing a domain corpus containing annotated concepts and has a high computational cost. Because instances in KGs have already been extracted from textual corpora and annotated with concepts, we propose graph-based IC, which computes IC from the distribution of concepts over instances. Through experiments on well-known word similarity datasets, we show that the wpath method produces a statistically significant improvement over other semantic similarity methods. Moreover, in a real category classification evaluation, the wpath method shows the best performance in terms of accuracy and F score.
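The idea in the abstract of weighting the shortest path length by the IC of shared concepts can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the toy taxonomy, the instance counts, the function names, the parameter `k`, and the specific weighting form `1 / (1 + length * k ** IC(lcs))`. The abstract does not state the formula, so this should not be read as the authors' exact definition or implementation.

```python
import math

# Hypothetical toy taxonomy: child -> parent (the root has parent None).
PARENT = {
    "entity": None,
    "animal": "entity",
    "artifact": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "artifact",
}

# Hypothetical counts of instances annotated directly with each concept.
DIRECT_INSTANCES = {
    "entity": 0, "animal": 2, "artifact": 1, "dog": 5, "cat": 4, "car": 8,
}


def ancestors(concept):
    """Return the chain [concept, parent, ..., root]."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def graph_ic(concept):
    """Assumed graph-based IC: -log(freq(c) / N), where freq(c) counts
    instances of the concept and of all its descendants."""
    total = sum(DIRECT_INSTANCES.values())
    freq = sum(n for c, n in DIRECT_INSTANCES.items() if concept in ancestors(c))
    return -math.log(freq / total)


def lcs_and_path_length(a, b):
    """Return the least common subsumer and the path length through it."""
    chain_a, chain_b = ancestors(a), ancestors(b)
    for steps_a, candidate in enumerate(chain_a):
        if candidate in chain_b:
            return candidate, steps_a + chain_b.index(candidate)
    raise ValueError("concepts share no ancestor")


def wpath_like(a, b, k=0.8):
    """Assumed IC-weighted path similarity; k in (0, 1] is a hypothetical
    parameter controlling how much the IC of the LCS shrinks the path."""
    lcs, length = lcs_and_path_length(a, b)
    return 1.0 / (1.0 + length * k ** graph_ic(lcs))


if __name__ == "__main__":
    for pair in [("dog", "cat"), ("dog", "car"), ("dog", "dog")]:
        print(pair, round(wpath_like(*pair), 4))
```

In this sketch, a more specific (higher-IC) shared ancestor shrinks the effective path length, so "dog" and "cat" score higher than "dog" and "car", whose only shared ancestor is the uninformative root.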
Pages: 72-85
Number of pages: 14