Semantic similarity assessment of words using weighted WordNet

Cited by: 29
Authors
Ahsaee, Mostafa Ghazizadeh [1, 2]
Naghibzadeh, Mahmoud [2]
Naeini, S. Ehsan Yasrebi [3]
Affiliations
[1] Ferdowsi Univ Mashhad, Commun & Comp Res Ctr, Mashhad, Iran
[2] Ferdowsi Univ Mashhad, Dept Comp Engn, Mashhad, Iran
[3] Torbat e Heydariyeh Higher Educ Complex, Dept Comp Engn, Torbat Heydariyeh, Iran
Keywords
Weighted WordNet; Semantic similarity; WordNet Hierarchy; Synset; Correlation
DOI
10.1007/s13042-012-0135-3
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Assessing the similarity of words and concepts is one of the most important elements of natural language processing and of information and knowledge retrieval. WordNet, a popular concept hierarchy, is used in many such applications, and word similarity in WordNet has also been the subject of recent research. Many studies that use WordNet calculate the similarity of a word pair from the depth of the words' subsumer and the shortest path between them. This paper presents three novel models that improve the semantic word similarity measure by assigning weights to the edges of the WordNet hierarchy. The underlying assumption is that the nearer an edge is to the root of the hierarchy, the less effect it has on the similarity calculation. We therefore propose a new formula for weighting the edges of the hierarchy, use it to calculate the distance between two words and the depth of words, and then tune the parameters of the transfer functions using particle swarm optimization. Experimental results on a common benchmark created by human judgment show that the resulting correlation improves. Furthermore, applying our formulae to a more realistic task, sentence similarity assessment, led to better results.
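The edge-weighting idea in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration, since the abstract does not give the paper's formulas: the toy hierarchy `PARENT`, the weight `1 - base ** depth`, the exponential-times-tanh transfer function, and the fixed values of `alpha` and `beta` (which the paper instead tunes with particle swarm optimization) are all hypothetical.

```python
import math

# Toy is-a hierarchy (child -> parent). Hypothetical data, not taken from WordNet.
PARENT = {
    "entity": None,
    "animal": "entity",
    "artifact": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "artifact",
}
ROOT = "entity"


def path_to_root(node):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = [node]
    while PARENT[node] is not None:
        node = PARENT[node]
        path.append(node)
    return path


def depth(node):
    """Number of edges between `node` and the root."""
    return len(path_to_root(node)) - 1


def edge_weight(child, base=0.5):
    """Weight of the edge from `child` to its parent.

    Assumed form: the weight grows with the child's depth, so edges
    near the root contribute less. This is not the paper's formula.
    """
    return 1.0 - base ** depth(child)


def weighted_length(node, ancestor):
    """Sum of edge weights on the upward path from `node` to `ancestor`."""
    total = 0.0
    while node != ancestor:
        total += edge_weight(node)
        node = PARENT[node]
    return total


def lowest_common_subsumer(a, b):
    """Deepest node that is an ancestor of (or equal to) both `a` and `b`."""
    ancestors_a = set(path_to_root(a))
    for node in path_to_root(b):
        if node in ancestors_a:
            return node
    raise ValueError("nodes share no common ancestor")


def similarity(a, b, alpha=0.2, beta=0.6):
    """Illustrative transfer function of weighted distance and subsumer depth.

    alpha and beta are fixed placeholders; the paper tunes its
    parameters with particle swarm optimization.
    """
    lcs = lowest_common_subsumer(a, b)
    dist = weighted_length(a, lcs) + weighted_length(b, lcs)
    lcs_depth = weighted_length(lcs, ROOT)
    return math.exp(-alpha * dist) * math.tanh(beta * lcs_depth)
```

Under these assumptions, `similarity("dog", "cat")` exceeds `similarity("dog", "car")`, because the first pair shares the deeper subsumer `animal` while the second pair meets only at the root.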
Pages: 479-490
Page count: 12