Meta-Analysis for Retrieval Experiments Involving Multiple Test Collections

Cited by: 8
Author
Soboroff, Ian [1]
Affiliation
[1] NIST, Gaithersburg, MD 20899 USA
Source
CIKM'18: PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT | 2018
Keywords
meta-analysis; score standardization; effect sizes
DOI
10.1145/3269206.3271719
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Traditional practice recommends that information retrieval experiments be run over multiple test collections, to support, if not prove, that gains in performance are likely to generalize to other collections or tasks. However, because of the pooling assumptions, evaluation scores are not directly comparable across different test collections. We present a widely-used statistical tool, meta-analysis, as a framework for reporting results from IR experiments using multiple test collections. We demonstrate the meta-analytical approach through two standard experiments on stemming and pseudo-relevance feedback, and compare the results to those obtained from score standardization. Meta-analysis incorporates several recent recommendations in the literature, including score standardization, reporting effect sizes rather than score differences, and avoiding a reliance on null-hypothesis statistical testing, in a unified approach. It therefore represents an important methodological improvement over using these techniques in isolation.
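As a rough illustration of the kind of pooling the abstract describes, the sketch below computes a standardized effect size per test collection and combines the collections with inverse-variance weights. It is a minimal sketch under stated assumptions, not the paper's procedure: the function names are hypothetical, the effect size is a paired standardized mean difference with a common large-sample variance approximation, and a fixed-effect model is assumed (the paper may use a different effect size or a random-effects model).

```python
import math
from statistics import mean, stdev


def paired_effect_size(baseline, treatment):
    """Standardized mean difference for paired per-topic scores.

    Assumption: effect = mean(diff) / sd(diff), with the large-sample
    variance approximation 1/n + d^2 / (2n).
    """
    diffs = [t - b for b, t in zip(baseline, treatment)]
    n = len(diffs)
    d = mean(diffs) / stdev(diffs)
    variance = 1.0 / n + d * d / (2.0 * n)
    return d, variance


def fixed_effect_pool(effects):
    """Inverse-variance weighted mean of (effect, variance) pairs.

    Returns the pooled effect and an approximate 95% confidence interval.
    """
    weights = [1.0 / v for _, v in effects]
    pooled = sum(w * d for w, (d, _) in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se)
```

Under these assumptions, each collection contributes one `(effect, variance)` pair from `paired_effect_size`, and `fixed_effect_pool` reports a single summary effect with an interval rather than a per-collection significance test.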
Pages: 713-722
Page count: 10