Measuring Semantic Similarity between Words Using Wikipedia

Cited by: 11
Authors
Lu Zhiqiang [1 ]
Shao Werimin [1 ]
Yu Zhenhua [2 ]
Affiliations
[1] Shanghai Univ, Sch Engn & Comp Sci, Shanghai, Peoples R China
[2] Second Ltd Liabil Co, Shandong, Peoples R China
Keywords
Text semantic similarity; Wikipedia; TF-IDF; cosine similarity
DOI
10.1109/WISM.2009.59
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Semantic similarity measures play an important role in the extraction of semantic relations and are widely used in Natural Language Processing (NLP) and Information Retrieval (IR). This paper presents a new Web-based method for measuring the semantic similarity between words. Unlike other methods, which are based on a taxonomy or on Internet search engines, our method uses snippets from Wikipedia to calculate the semantic similarity between words using cosine similarity and TF-IDF. A stemming algorithm and stop-word removal are used to preprocess the Wikipedia snippets. We set different thresholds when evaluating our results in order to reduce interference from noise and redundancy. Our method was empirically evaluated on the Rubenstein-Goodenough benchmark dataset. It achieves a higher correlation value (0.615) than some existing methods. The evaluation results show that our method improves accuracy and is more robust for measuring semantic similarity between words.
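The pipeline named in the abstract (stop-word removal, stemming, TF-IDF weighting, cosine similarity) can be illustrated with a minimal sketch. This is not the authors' implementation: the stop-word list, the `crude_stem` suffix stripper (a stand-in for a real stemmer), the smoothed IDF formula, and the function names are all assumptions made for illustration, and the input strings stand in for Wikipedia snippets that would be retrieved separately.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the paper's list).
STOP_WORDS = {"a", "an", "the", "of", "is", "are", "in", "and", "or",
              "to", "that", "by", "with", "for", "on", "as"}


def crude_stem(token):
    """Strip a few common suffixes; a rough stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, tokenize, drop stop words, and stem the remaining tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of token lists."""
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # document frequency of each term
    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        total = len(tokens) or 1
        # Smoothed IDF (an assumption) keeps weights positive even when
        # a term occurs in every document.
        vectors.append({
            t: (c / total) * (math.log((1 + n) / (1 + df[t])) + 1.0)
            for t, c in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity(snippet_a, snippet_b, background=()):
    """Score two snippets; `background` texts only contribute to the IDF."""
    docs = [preprocess(snippet_a), preprocess(snippet_b)]
    docs += [preprocess(text) for text in background]
    vectors = tfidf_vectors(docs)
    return cosine(vectors[0], vectors[1])
```

Under these assumptions, `similarity(text_for_word_1, text_for_word_2)` returns a score between 0 and 1, with identical snippets scoring 1 and snippets sharing no terms scoring 0. How the paper applies its thresholds is not specified in the abstract, so that step is left out.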
Pages: 251 / +
Page count: 3
Related papers
50 in total
  • [21] Measuring semantic relatedness using wikipedia signed network
    1600, Institute of Information Science (29):
  • [22] Clustering Source Code Elements by Semantic Similarity Using Wikipedia
    Schindler, Mirco
    Fox, Oliver
    Rausch, Andreas
    2015 IEEE/ACM FOURTH INTERNATIONAL WORKSHOP ON REALIZING ARTIFICIAL INTELLIGENCE SYNERGIES IN SOFTWARE ENGINEERING (RAISE 2015), 2015, : 13 - 18
  • [23] Fuzzy Semantic Similarity in Linked Data using Wikipedia Infobox
    Zadeh, Parisa D. Hossein
    Reformat, Marek Z.
    PROCEEDINGS OF THE 2013 JOINT IFSA WORLD CONGRESS AND NAFIPS ANNUAL MEETING (IFSA/NAFIPS), 2013, : 395 - 400
  • [24] Measuring Taxonomic Similarity between Words Using Restrictive Context Matrices
    Wang, Shi
    Cao, Cungen
    Cao, Ya-nan
    Lu, Han
    Cao, Xinyu
    FIFTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 4, PROCEEDINGS, 2008, : 193 - 197
  • [25] Semantic Relatedness Measurement between Words based on Link Information of Wikipedia
    Wang, Rui-Qin
    2011 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTATION AND INDUSTRIAL APPLICATION (ICIA2011), VOL I, 2011, : 153 - 157
  • [26] Semantic Relatedness Measurement between Words based on Link Information of Wikipedia
    Wang, Rui-Qin
    2010 THE 3RD INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND INDUSTRIAL APPLICATION (PACIIA2010), VOL VI, 2010, : 155 - 159
  • [27] Measuring the Strength of the Semantic Relationship Between Words
    Stanchev, Lubomir
    INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2015, 24 (02)
  • [28] A Methodology for E-Content Preparation using Semantic Similarity between Words
    Gopal, U. Nanda
    2012 INTERNATIONAL CONFERENCE ON RADAR, COMMUNICATION AND COMPUTING (ICRCC), 2012, : 235 - 238
  • [29] Measuring similarity between trajectories using motion verbs in semantic level
    Cho, Miyoung
    Choi, Chang
    Kim, Pankoo
    9TH INTERNATIONAL CONFERENCE ON ADVANCED COMMUNICATION TECHNOLOGY: TOWARD NETWORK INNOVATION BEYOND EVOLUTION, VOLS 1-3, 2007, : 511 - +
  • [30] Measuring Semantic Similarity Between Sentences Using a Siamese Neural Network
    Ichida, Alexandre Yukio
    Meneguzzi, Felipe
    Ruiz, Duncan D.
    2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018,