Evaluating information retrieval system performance based on user preference

Cited by: 0
Authors
Bing Zhou
Yiyu Yao
Affiliation
[1] University of Regina, Department of Computer Science
Source
Journal of Intelligent Information Systems | 2010, Vol. 34
Keywords
Multi-grade relevance; Evaluation methods; User preference
Abstract
One of the challenges of modern information retrieval is to rank the most relevant documents at the top of a large system output. This calls for choosing proper methods to evaluate system performance. Traditional performance measures, such as precision and recall, are based on binary relevance judgments and are not appropriate for multi-grade relevance. The main objective of this paper is to propose a framework for system evaluation based on user preference of documents. It is shown that the notion of user preference is general and flexible for formally defining and interpreting multi-grade relevance. We review 12 evaluation methods and compare their similarities and differences. We find that the normalized distance performance measure is a good choice: it is sensitive to document rank order and gives higher credit to systems for their ability to retrieve highly relevant documents.
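The contrast the abstract draws between binary measures and a preference-based measure can be illustrated with a small sketch. It is not taken from the paper: the document IDs, grades, and function names are hypothetical, and the distance function assumes the commonly cited normalized distance form (2 x contradicted pairs + tied pairs) / (2 x preferred pairs), simplified here to strictly ordered system rankings so that the tied term is zero.

```python
from itertools import combinations

def precision_at_k(ranking, grades, k, threshold=1):
    """Binary precision@k: a document is 'relevant' if its grade >= threshold."""
    return sum(1 for d in ranking[:k] if grades[d] >= threshold) / k

def normalized_distance(ranking, grades):
    """Fraction of user-preferred pairs that the system ranking inverts.

    Assumption: the system ranking has no ties, so the general
    (2 * contradicted + tied) / (2 * preferred) form reduces to
    contradicted / preferred. 0.0 means full agreement with the user.
    """
    position = {d: i for i, d in enumerate(ranking)}
    preferred = contradicted = 0
    for a, b in combinations(grades, 2):
        if grades[a] == grades[b]:
            continue  # the user expresses no preference for this pair
        preferred += 1
        better, worse = (a, b) if grades[a] > grades[b] else (b, a)
        if position[better] > position[worse]:
            contradicted += 1  # system placed the less-preferred document first
    return contradicted / preferred

# Hypothetical multi-grade judgments: 3 = highly relevant, 0 = not relevant.
grades = {"d1": 3, "d2": 1, "d3": 1, "d4": 0}
ranking_a = ["d1", "d2", "d3", "d4"]  # highly relevant document first
ranking_b = ["d2", "d3", "d1", "d4"]  # highly relevant document third

for name, ranking in (("A", ranking_a), ("B", ranking_b)):
    print(name, precision_at_k(ranking, grades, 2), normalized_distance(ranking, grades))
```

Both rankings reach the same binary precision at rank 2, while the preference-based distance separates them because ranking B places two marginally relevant documents ahead of the highly relevant one.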
Pages: 227-248