sMARE: a new paradigm to evaluate and understand query performance prediction methods

Cited by: 12
Authors
Faggioli, Guglielmo [1 ]
Zendel, Oleg [2 ]
Culpepper, J. Shane [2 ]
Ferro, Nicola [1 ]
Scholer, Falk [2 ]
Affiliations
[1] Univ Padua, Padua, Italy
[2] RMIT Univ, Melbourne, Vic, Australia
Source
INFORMATION RETRIEVAL JOURNAL | 2022, Vol. 25, No. 2
Funding
Australian Research Council;
Keywords
Query performance prediction; Systems evaluation; Analysis of variance; Query formulations; Information retrieval;
DOI
10.1007/s10791-022-09407-w
Chinese Library Classification (CLC)
TP [Automation technology; computer technology];
Discipline Classification Code
0812;
Abstract
Query performance prediction (QPP) has been studied extensively in the IR community over the last two decades. A by-product of this research is a methodology to evaluate the effectiveness of QPP techniques. In this paper, we re-examine the existing evaluation methodology commonly used for QPP, and propose a new approach. Our key idea is to model QPP performance as a distribution instead of relying on point estimates. To obtain such a distribution, we exploit the scaled Absolute Ranking Error (sARE) measure, and its mean, the scaled Mean Absolute Ranking Error (sMARE). Our work demonstrates important statistical implications, and overcomes key limitations imposed by the currently used correlation-based point-estimate evaluation approaches. We also explore the potential benefits of using multiple query formulations and ANalysis Of VAriance (ANOVA) modeling to measure interactions between multiple factors. The resulting statistical analysis, combined with a novel evaluation framework, demonstrates the merits of modeling QPP performance as distributions, and enables detailed statistical ANOVA models to be created for comparative analyses.
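As a rough illustration of the relationship stated in the abstract (sMARE is the mean of the per-query sARE values), the following minimal Python sketch computes both quantities. It assumes, as the abstract suggests but does not spell out, that sARE for a query is the absolute difference between the rank induced by the predictor's scores and the rank induced by the actual effectiveness scores, scaled by the number of queries; all function names and example values are illustrative, not the authors' reference implementation.

```python
# Minimal sketch of sARE / sMARE under the assumptions stated above.
from typing import Dict, List


def ranks(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank queries by descending score; rank 1 = highest score."""
    ordered: List[str] = sorted(scores, key=scores.get, reverse=True)
    return {qid: r for r, qid in enumerate(ordered, start=1)}


def sare(predicted: Dict[str, float], actual: Dict[str, float]) -> Dict[str, float]:
    """Per-query scaled Absolute Ranking Error: |predicted rank - actual rank| / |Q|."""
    rank_pred, rank_act = ranks(predicted), ranks(actual)
    n = len(actual)
    return {qid: abs(rank_pred[qid] - rank_act[qid]) / n for qid in actual}


def smare(predicted: Dict[str, float], actual: Dict[str, float]) -> float:
    """Scaled Mean Absolute Ranking Error: the mean of the per-query sARE values."""
    per_query = sare(predicted, actual)
    return sum(per_query.values()) / len(per_query)


if __name__ == "__main__":
    # Hypothetical QPP scores and actual effectiveness (e.g., AP) values.
    qpp_scores = {"q1": 0.9, "q2": 0.4, "q3": 0.7}
    ap_scores = {"q1": 0.35, "q2": 0.30, "q3": 0.60}
    print(sare(qpp_scores, ap_scores))   # per-query distribution of errors
    print(smare(qpp_scores, ap_scores))  # single-number summary (mean)
```

Keeping the full set of per-query sARE values, rather than only the sMARE summary, is what yields the distributional view of QPP performance that the abstract advocates.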
Pages: 94-122
Number of pages: 29