Evaluating information retrieval system performance based on user preference

Cited by: 33
Authors
Zhou, Bing [1]
Yao, Yiyu [1]
Affiliation
[1] Univ Regina, Dept Comp Sci, Regina, SK S4S 0A2, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Multi-grade relevance; Evaluation methods; User preference;
DOI
10.1007/s10844-009-0096-5
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
One of the challenges of modern information retrieval is to rank the most relevant documents at the top of a large system output. This calls for choosing proper methods to evaluate system performance. Traditional performance measures, such as precision and recall, are based on binary relevance judgments and are not appropriate for multi-grade relevance. The main objective of this paper is to propose a framework for system evaluation based on user preference of documents. It is shown that the notion of user preference is general and flexible for formally defining and interpreting multi-grade relevance. We review 12 evaluation methods and compare their similarities and differences. We find that the normalized distance performance measure is a good choice: it is sensitive to document rank order and gives higher credit to systems for their ability to retrieve highly relevant documents.
Pages: 227-248
Number of pages: 22