PREFCA: A portal retrieval engine based on formal concept analysis

Cited by: 15
Authors
Negm, Eman [1]
AbdelRahman, Samir [1]
Bahgat, Reem [1]
Affiliations
[1] Cairo Univ, Fac Comp & Informat, Dept Comp Sci, Giza, Egypt
Keywords
Information retrieval; Formal concept analysis; Network analysis; Portal retrieval; INFORMATION-RETRIEVAL; CONCEPT LATTICES; TEXT RETRIEVAL; WEB; ALGORITHMS;
DOI
10.1016/j.ipm.2016.08.002
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The web is a network of linked sites in which each site forms either a physical portal or a standalone page. In the former case, the portal provides an access point to its embedded web pages, which coherently present a specific topic. In the latter case, millions of standalone web pages that share the same topic are scattered throughout the web and could be conceptually linked together to form virtual portals. Search engines have been developed to help users reach adequate pages efficiently and effectively. All known current search engine techniques rely on the web page as the basic atomic search unit. They ignore the conceptual links among the retrieved pages, which reveal implicit web-related meanings. However, a semantic model built for a whole portal may contain more semantic information than a model of scattered individual pages. In addition, user queries can be poor and contain imprecise terms that do not reflect the real user intention. Consequently, retrieving the standalone individual pages that are directly related to the query may not satisfy the user's need. In this paper, we propose PREFCA, a Portal Retrieval Engine based on Formal Concept Analysis that relies on the portal as the main search unit. PREFCA consists of three phases. First, the information extraction phase extracts the portal's semantic data. Second, the formal concept analysis phase uses formal concept analysis to discover the conceptual links among portals and attributes. Finally, in the information retrieval phase, we propose a portal ranking method to retrieve ranked pairs of portals and embedded pages. Additionally, we apply network analysis rules to output some portal characteristics. We evaluated PREFCA using two data sets, namely the Forum for Information Retrieval Evaluation 2010 and ClueWeb09 (category B) test data, for physical and virtual portals, respectively. PREFCA achieves higher F-measure accuracy and better Mean Average Precision ranking than other search engine approaches, namely Term Frequency Inverse Document Frequency (TF-IDF), Latent Semantic Analysis (LSA), and BM25 techniques, with comparable network analysis and efficiency results. It also attains high Mean Average Precision in comparison with learning-to-rank techniques. Moreover, PREFCA achieves a better reach time than Carrot, a well-known topic-based search engine. © 2016 Elsevier Ltd. All rights reserved.
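As a rough illustration of the formal concept analysis step mentioned in the abstract, the sketch below enumerates the formal concepts of a tiny portal-by-attribute context. It is not PREFCA's algorithm: the abstract does not describe one, and this is a naive enumeration that is exponential in the number of portals. The context data and all function names (`CONTEXT`, `common_attributes`, `objects_having`, `formal_concepts`) are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy context: portals (objects) mapped to their terms (attributes).
CONTEXT = {
    "portal_a": {"retrieval", "ranking"},
    "portal_b": {"retrieval", "lattice"},
    "portal_c": {"retrieval", "ranking", "lattice"},
}


def common_attributes(objects, context):
    """Return the attributes shared by every object in `objects`."""
    # Start from all attributes so that the empty object set maps to everything.
    shared = set().union(*context.values())
    for obj in objects:
        shared &= context[obj]
    return shared


def objects_having(attributes, context):
    """Return the objects that carry every attribute in `attributes`."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every subset of objects."""
    objs = sorted(context)
    concepts = set()
    for size in range(len(objs) + 1):
        for subset in combinations(objs, size):
            intent = common_attributes(subset, context)
            extent = objects_having(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


if __name__ == "__main__":
    # Print concepts from the largest extent (most general) to the smallest.
    for extent, intent in sorted(formal_concepts(CONTEXT), key=lambda c: -len(c[0])):
        print(sorted(extent), sorted(intent))
```

Each resulting pair groups the portals that share exactly a given set of attributes, which is the kind of conceptual link between portals and attributes the abstract refers to. A practical system would use a dedicated lattice-construction algorithm rather than this brute-force closure.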
Pages: 203 - 222
Number of pages: 20
Related papers
50 items in total
  • [1] Information Retrieval Based on Formal Concept Analysis
    Zhi Dongjie
    PROCEEDINGS OF THE FOURTH INTERNATIONAL SYMPOSIUM ON EDUCATION MANAGEMENT AND KNOWLEDGE INNOVATION ENGINEERING, VOLS 1 AND 2, 2011, : 741 - 745
  • [2] Adaptation guided retrieval based on formal concept analysis
    Díaz-Agudo, B
    Gervás, P
    González-Calero, PA
    CASE-BASED REASONING RESEARCH AND DEVELOPMENT, PROCEEDINGS, 2003, 2689 : 131 - 145
  • [3] Classification based retrieval using Formal Concept Analysis
    Díaz-Agudo, B
    González-Calero, PA
    CASE-BASED REASONING RESEARCH AND DEVELOPMENT, PROCEEDINGS, 2001, 2080 : 173 - 188
  • [4] Research on Image Retrieval Based on Formal Concept Analysis
    Wang, Xiaomin
    Liu, Jianbo
    Zhang, Yanyan
    Huang, Lihua
    2015 27TH CHINESE CONTROL AND DECISION CONFERENCE (CCDC), 2015, : 3290 - 3295
  • [5] Intelligent search engine based on Formal Concept Analysis
    Shen, Xiajiong
    Xu, Yan
    Yu, Junyang
    Zhang, Ke
    GRC: 2007 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING, PROCEEDINGS, 2007, : 669 - 674
  • [6] Management and retrieval of Web services based on formal concept analysis
    Peng, DL
    Huang, S
    Wang, XL
    Zhou, AY
    FIFTH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY - PROCEEDINGS, 2005, : 269 - 275
  • [7] Formal Concept Analysis and Information Retrieval - A Survey
    Codocedo, Victor
    Napoli, Amedeo
    FORMAL CONCEPT ANALYSIS (ICFCA 2015), 2015, 9113 : 61 - 77
  • [8] Concept Location Using Formal Concept Analysis and Information Retrieval
    Poshyvanyk, Denys
    Gethers, Malcom
    Marcus, Andrian
    ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 2012, 21 (04)
  • [9] IRAFCA: an O(n) information retrieval algorithm based on formal concept analysis
    Fkih, Fethi
    Omri, Mohamed Nazih
    KNOWLEDGE AND INFORMATION SYSTEMS, 2016, 48 (02) : 465 - 491
  • [10] Text-based Image Indexing and Retrieval using Formal Concept Analysis
    Ahmad, Imran Shafiq
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2008, 2 (03) : 150 - 170