On the reliability of information retrieval metrics based on graded relevance

Cited by: 59
Authors
Sakai, Tetsuya [1 ]
Affiliation
[1] Toshiba Corp R&D Ctr, Knowledge Media Lab, Saiwai Ku, Kawasaki, Kanagawa 2128582, Japan
Keywords
evaluation; reliability; graded relevance; Q-measure; cumulative gain;
DOI
10.1016/j.ipm.2006.07.020
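Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract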
This paper compares 14 information retrieval metrics based on graded relevance, together with 10 traditional metrics based on binary relevance, in terms of stability, sensitivity and resemblance of system rankings. More specifically, we compare these metrics using the Buckley/Voorhees stability method, the Voorhees/Buckley swap method and Kendall's rank correlation, with three data sets comprising test collections and submitted runs from NTCIR. Our experiments show that (Average) Normalised Discounted Cumulative Gain at document cut-off l are the best among the rank-based graded-relevance metrics, provided that l is large. On the other hand, if one requires a recall-based graded-relevance metric that is highly correlated with Average Precision, then Q-measure is the best choice. Moreover, these best graded-relevance metrics are at least as stable and sensitive as Average Precision, and are fairly robust to the choice of gain values. © 2006 Elsevier Ltd. All rights reserved.
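To make two of the quantities named in the abstract concrete, here is a minimal Python sketch of Normalised Discounted Cumulative Gain at a document cut-off l and of Kendall's rank correlation between two system rankings. It is an illustration under stated assumptions, not the paper's exact definitions: the discount follows the log-base-b formulation commonly attributed to Järvelin and Kekäläinen (ranks up to b are not discounted), the function names `dcg_at`, `ndcg_at` and `kendall_tau` are hypothetical, the gain values 3/2/1 are example values only, and ties between systems are simply counted as neither concordant nor discordant.

```python
import math
from itertools import combinations


def dcg_at(gains, l, b=2):
    """Discounted cumulative gain over the top l ranks (assumed log-base-b discount)."""
    total = 0.0
    for rank, gain in enumerate(gains[:l], start=1):
        # Ranks up to b are not discounted; later ranks are divided by log_b(rank).
        total += gain if rank <= b else gain / math.log(rank, b)
    return total


def ndcg_at(run_gains, all_relevant_gains, l, b=2):
    """nDCG at cut-off l: DCG of the run divided by DCG of an ideal ranking."""
    ideal = sorted(all_relevant_gains, reverse=True)
    ideal_dcg = dcg_at(ideal, l, b)
    return dcg_at(run_gains, l, b) / ideal_dcg if ideal_dcg > 0 else 0.0


def kendall_tau(scores_a, scores_b):
    """Kendall's tau between the system rankings induced by two metrics.

    Both arguments map a system name to its mean score under one metric.
    """
    systems = list(scores_a)
    concordant = discordant = 0
    for s, t in combinations(systems, 2):
        sign = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(systems) * (len(systems) - 1) / 2
    return (concordant - discordant) / n_pairs


if __name__ == "__main__":
    # Example gain values (assumption): 3 = highly relevant, 2 = relevant, 1 = partially relevant.
    judged = [3, 2, 1]
    print(ndcg_at([1, 2, 3], judged, l=3))  # a run that ranks the best document last
    print(kendall_tau({"A": 0.5, "B": 0.4, "C": 0.3}, {"A": 0.9, "B": 0.7, "C": 0.8}))
```

A tau of 1 means two metrics rank the systems identically and -1 means the rankings are exactly reversed, which is how "resemblance of system rankings" can be read in this sketch.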
Pages: 531-548
Number of pages: 18