Document-based approach to improve the accuracy of pairwise comparison in evaluating information retrieval systems

Cited by: 1
Authors
Ravana, Sri Devi [1]
Taheri, Masumeh Sadat [1]
Rajagopal, Prabha [1]
Affiliations
[1] Univ Malaya, Dept Informat Syst, Kuala Lumpur, Malaysia
Keywords
Information retrieval; Document-based evaluation; Information retrieval evaluation; Pairwise comparison; Significance test; SIGN TEST; TIES;
DOI
10.1108/AJIM-12-2014-0171
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Purpose - This paper proposes a method for comparing the performance of paired information retrieval (IR) systems more accurately than the current method, which is based on the systems' mean effectiveness scores across a set of identified topics/queries. Design/methodology/approach - In the proposed approach, document-level scores, rather than the classic set of topic scores, serve as the evaluation unit. These document scores are the defined document weights, which take the place of the systems' mean average precision (MAP) scores as the significance test statistic. The experiments were conducted using the TREC 9 Web track collection. Findings - The p-values generated by two significance tests, the Student's t-test and the Mann-Whitney test, show that using document-level scores as the evaluation unit makes the difference between IR systems more significant than using topic scores. Originality/value - A suitable test collection is a primary prerequisite for the comparative evaluation of IR systems. However, in addition to reusable test collections, accurate statistical testing is a necessity for these evaluations. The findings of this study will help IR researchers evaluate their retrieval systems and algorithms more accurately.
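As a rough illustration of the kind of pairwise comparison the abstract describes, the sketch below computes a paired t statistic and a Mann-Whitney U statistic for two systems' per-unit scores, where a unit may be a topic or a document. It is not the authors' implementation: the function names, the paired form of the t-test, the normal approximation for the Mann-Whitney p-value (without a tie correction in the variance), and the toy scores are all assumptions made for illustration.

```python
import math
from statistics import NormalDist, mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """t statistic over per-unit score differences (assumes paired units)."""
    diffs = [x - y for x, y in zip(scores_a, scores_b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


def mann_whitney_u(scores_a, scores_b):
    """Return (U for scores_a, two-sided p-value by normal approximation).

    Tied values receive the average of the ranks they span. The variance
    used for the p-value is the uncorrected one, so the p-value is only
    approximate when ties are present.
    """
    pooled = sorted((v, i) for i, v in enumerate(list(scores_a) + list(scores_b)))
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[pooled[k][1]] = avg_rank
        i = j + 1
    n1, n2 = len(scores_a), len(scores_b)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mu) / sigma
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u1, p_value


if __name__ == "__main__":
    # Made-up per-unit scores for two hypothetical systems.
    system_a = [0.52, 0.61, 0.48, 0.70, 0.66]
    system_b = [0.45, 0.58, 0.41, 0.62, 0.60]
    print("paired t statistic:", paired_t_statistic(system_a, system_b))
    print("Mann-Whitney U, p:", mann_whitney_u(system_a, system_b))
```

Under these assumptions, the same two functions can be run once on topic-level scores and once on document-level scores to compare the resulting statistics, which is the contrast the abstract reports.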
Pages: 408-421
Page count: 14
Related papers
50 in total
  • [31] Effective Retrieval Of Related Documents Based On Spelling Correction To Improve Information Retrieval System
    Houtinezhad, Maryam
    Ghaffary, Hamid Reza
    2018 3RD CONFERENCE ON SWARM INTELLIGENCE AND EVOLUTIONARY COMPUTATION (CSIEC2018), VOL 3, 2018, : 37 - 42
  • [32] Topic Model based Approach for Improved Indexing in Content based Document Retrieval
    Cha, Moon Soo
    Kim, So Yeon
    Ha, Jae Hee
    Lee, Min-June
    Choi, Young-June
    Sohn, Kyung-Ah
    INTERNATIONAL JOURNAL OF NETWORKED AND DISTRIBUTED COMPUTING, 2016, 4 (01) : 55 - 64
  • [33] Weighted PCA for Improving Document Image Retrieval System Based On Keyword Spotting Accuracy
    Tavoli, Reza
    Kozegar, Ehsan
    Shojafar, Mohammad
    Soleimani, Hossein
    Pooranian, Zahra
    2013 36TH INTERNATIONAL CONFERENCE ON TELECOMMUNICATIONS AND SIGNAL PROCESSING (TSP), 2013, : 773 - 777
  • [34] Document management systems from current capabilities towards intelligent information retrieval: an overview
    Zantout, H
    Marir, F
    INTERNATIONAL JOURNAL OF INFORMATION MANAGEMENT, 1999, 19 (06) : 471 - 484
  • [35] A Purely Entity-Based Semantic Search Approach for Document Retrieval
    Sidi, Mohamed Lemine
    Gunal, Serkan
    APPLIED SCIENCES-BASEL, 2023, 13 (18):
  • [36] A Hash-Based Approach for Document Retrieval by Utilizing Term Features
    Gupta, Rajeev Kumar
    Patel, Durga
    Bramhe, Ankit
    COMPUTATIONAL INTELLIGENCE IN DATA MINING, 2019, 711 : 617 - 627
  • [37] Intelligent retrieval method of library document information based on hidden topic mining
    An, Yujie
    Yan, Yuwei
    WEB INTELLIGENCE, 2022, 20 (02) : 93 - 102
  • [38] An Information Retrieval Based Approach for Multilingual Ontology Matching
    Rexha, Andi
    Dragoni, Mauro
    Kern, Roman
    Kroell, Mark
    NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, NLDB 2016, 2016, 9612 : 433 - 439
  • [39] CI-SNF: Exploiting contextual information to improve SNF based information retrieval
    Chen, Ning
    INFORMATION FUSION, 2019, 52 : 175 - 186
  • [40] A merge-based clustering approach for information retrieval
    Lee, WJ
    Chung, JS
    Lee, SJ
    7TH WORLD MULTICONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL II, PROCEEDINGS: COMPUTER SCIENCE AND ENGINEERING, 2003, : 250 - 255