Semantically-enhanced information retrieval using multiple knowledge sources

Cited by: 12
Author
Jiang, Yuncheng [1]
Affiliation
[1] South China Normal Univ, Sch Comp Sci, Guangzhou 510631, Peoples R China
Source
CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS | 2020, Vol. 23, No. 4
Funding
National Natural Science Foundation of China
Keywords
Information retrieval; Keyword search; Semantic relatedness; Multiple knowledge sources; WORD SENSE DISAMBIGUATION; LINKED DATA; SEARCH; ONTOLOGY; WEB; SIMILARITY; WIKIPEDIA; CONSTRUCTION; POINT;
DOI
10.1007/s10586-020-03057-7
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Classical Information Retrieval (IR) approaches rely on word-based representations of the query and of the documents in the collection. The user's information need is specified entirely by the words in the original query, and retrieval returns documents containing those words. Such approaches are limited by the absence of relevant keywords and by term variation between documents and the user's query. This paper presents a new method for Semantic Information Retrieval (SIR) that addresses these limitations. Concretely, we propose SIRWWO (Semantic Information Retrieval using Wikipedia, WordNet, and domain Ontologies), a novel SIR method that combines multiple knowledge sources: Wikipedia, WordNet, and Description Logic (DL) ontologies. To illustrate SIRWWO, we first present the notion of a Labeled Dynamic Semantic Network (LDSN), which extends the notions of dynamic semantic network and extended semantic net based on WordNet (and the DAML ontology library). From the LDSN we obtain the notion of a Weighted Dynamic Semantic Network (WDSN; intuitively, each edge in a WDSN is assigned a number in the interval [0, 1]) and give a method for constructing the WDSN using Wikipedia, WordNet, and DL ontologies. We then propose a novel metric for measuring the semantic relatedness between concepts based on the WDSN. Lastly, we investigate SIRWWO using the semantic relatedness between users' query keywords and digital documents. The experimental results show that our proposals achieve performance comparable to or better than that of the traditional IR system Lucene.
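
The abstract does not define the relatedness metric, so the following is only a minimal sketch of the general idea of a network whose edges carry weights in [0, 1], not the paper's method. It assumes, purely for illustration, that relatedness between two concepts is the largest product of edge weights along any connecting path, and that a document is scored by averaging each query keyword's best relatedness to the document's concepts. The names `relatedness`, `rank_documents`, and the toy graph are hypothetical.

```python
import heapq


def relatedness(graph, source, target):
    """Best path score between two concepts in a weighted network.

    ASSUMPTION (not from the paper): the score of a path is the product of
    its edge weights, and relatedness is the maximum over all paths.
    `graph` maps each concept to a dict {neighbor: weight in [0, 1]}.
    """
    if source == target:
        return 1.0
    best = {source: 1.0}
    heap = [(-1.0, source)]  # max-heap emulated with negated scores
    while heap:
        neg_score, node = heapq.heappop(heap)
        score = -neg_score
        if node == target:
            return score
        if score < best.get(node, 0.0):
            continue  # stale heap entry
        for neighbor, weight in graph.get(node, {}).items():
            candidate = score * weight
            if candidate > best.get(neighbor, 0.0):
                best[neighbor] = candidate
                heapq.heappush(heap, (-candidate, neighbor))
    return 0.0  # no connecting path


def rank_documents(graph, query_keywords, documents):
    """Rank documents by mean best-match relatedness to the query keywords.

    ASSUMPTION: `documents` maps a document id to its list of concepts.
    """
    scores = {}
    for doc_id, concepts in documents.items():
        per_keyword = [
            max((relatedness(graph, kw, c) for c in concepts), default=0.0)
            for kw in query_keywords
        ]
        scores[doc_id] = sum(per_keyword) / len(per_keyword) if per_keyword else 0.0
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy network with hand-picked weights, for illustration only.
wdsn = {
    "car": {"vehicle": 0.9, "engine": 0.5},
    "vehicle": {"car": 0.9, "truck": 0.8},
    "truck": {"vehicle": 0.8},
    "engine": {"car": 0.5},
}
docs = {"d1": ["truck"], "d2": ["engine"]}
ranking = rank_documents(wdsn, ["car"], docs)
```

Because every weight lies in [0, 1], path products never increase as a path grows, which is what lets a Dijkstra-style search find the best path. A document that shares no literal keyword with the query ("truck" for the query "car") can still be ranked, which is the limitation of word-based matching that the abstract describes.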
Pages: 2925-2944
Number of pages: 20