Information content measures of semantic similarity between documents based on Hadoop system

Cited by: 0
Authors
Birjali, Marouane [1]
Beni-Hssane, Abderrahim [1]
Erritali, Mohammed [2]
Madani, Youness [2]
Affiliations
[1] Univ Chouaib Doukkali, Fac Sci, Dept Comp Sci, El Jadida, Morocco
[2] Univ Sultan Moulay Slimane, Fac Sci & Technol, Dept Comp Sci, Beni Mellal, Morocco
Source
2016 INTERNATIONAL CONFERENCE ON WIRELESS NETWORKS AND MOBILE COMMUNICATIONS (WINCOM) | 2016
Keywords
distributed processing; Hadoop; Big Data; semantic similarity; MapReduce programming; WordNet
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Retrieving documents in response to a user's query is the most common text retrieval task. Our work focuses on detecting the semantic similarity between queries and the documents of a large collection. In this paper, we investigate MapReduce as a framework for managing the distributed processing of datasets and of semantic similarity measures between documents. We then survey the state of the art of approaches for computing the semantic similarity of documents. We propose an approach based on a parallel algorithm for semantic similarity measures, using MapReduce and WordNet, to detect the documents relevant to a query. Finally, we conduct basic experiments to assess the performance of the proposed approach and note the benefit that Hadoop and MapReduce bring to semantic similarity measurement between documents.
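The abstract does not say which similarity measure or job layout the authors use, so the following is only a minimal sketch of the general idea, under stated assumptions: a Resnik-style information content measure, a tiny hand-made taxonomy standing in for WordNet, and an in-process simulation of the map and reduce phases rather than a real Hadoop job. All names (`PARENT`, `COUNT`, `resnik`, `mapper`, `reducer`, `rank`) and all counts are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical toy taxonomy (child -> parent); a stand-in for WordNet.
PARENT = {"dog": "mammal", "cat": "mammal", "mammal": "animal",
          "bird": "animal", "animal": "entity", "car": "vehicle",
          "vehicle": "entity"}
# Hypothetical corpus counts observed for each concept itself.
COUNT = {"dog": 4, "cat": 4, "bird": 2, "car": 6,
         "mammal": 0, "animal": 0, "vehicle": 0, "entity": 0}

def ancestors(concept):
    """Return the concept followed by its ancestors up to the root."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

# Cumulative frequency: a concept's count includes all of its descendants.
FREQ = defaultdict(int)
for concept, n in COUNT.items():
    for a in ancestors(concept):
        FREQ[a] += n
TOTAL = FREQ["entity"]

def ic(concept):
    """Information content: -log of the concept's relative frequency."""
    return -math.log(FREQ[concept] / TOTAL)

def resnik(a, b):
    """IC of the most informative common subsumer; 0.0 for unknown words."""
    if a not in COUNT or b not in COUNT:
        return 0.0
    common = set(ancestors(a)) & set(ancestors(b))
    return max(ic(c) for c in common)

def mapper(doc_id, text, query):
    """Emit (doc_id, best similarity) once per query term."""
    words = text.lower().split()
    for q in query:
        yield doc_id, max((resnik(q, w) for w in words), default=0.0)

def reducer(doc_id, scores):
    """Average the per-term scores of one document."""
    return doc_id, sum(scores) / len(scores)

def rank(docs, query):
    """Simulate map, shuffle and reduce in-process, then sort by score."""
    groups = defaultdict(list)
    for doc_id, text in docs.items():
        for key, value in mapper(doc_id, text, query):
            groups[key].append(value)
    results = [reducer(k, v) for k, v in groups.items()]
    return sorted(results, key=lambda kv: kv[1], reverse=True)

DOCS = {"d1": "the cat chased a bird", "d2": "a car on the road"}
RANKING = rank(DOCS, ["dog"])
```

With the query `["dog"]`, `d1` ranks above `d2` because "dog" and "cat" share the fairly specific subsumer "mammal", whereas "dog" and "car" meet only at the root. In a real deployment the mapper and reducer would run as distributed tasks over document splits; the grouping dictionary here merely imitates the shuffle step.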
Pages: 187-192
Number of pages: 6