On the Relationship Between Query Characteristics and IR Functions Retrieval Bias

Cited by: 19
Authors
Bashir, Shariq [1]
Rauber, Andreas [1]
Affiliations
[1] Vienna Univ Technol, Inst Software Technol & Interact Syst, Vienna, Austria
Source
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY | 2011 / Vol. 62 / Issue 08
Keywords
INFORMATION;
DOI
10.1002/asi.21549
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Bias quantification of retrieval functions with the help of document retrievability scores has recently evolved as an important evaluation measure for recall-oriented retrieval applications. While numerous studies have evaluated the retrieval bias of retrieval functions, solid validation of its impact on realistic types of queries is still limited. This is due to the lack of well-accepted criteria for query generation for estimating retrievability. Commonly, random queries are used for approximating document retrievability because of the prohibitively large query space and the time involved in processing all queries. Additionally, a cumulative retrievability score of documents over all queries is used for analyzing the retrieval bias of retrieval functions. However, this approach does not consider the differences between query characteristics (QCs) and their influence on the quantification of retrieval bias. This article provides an in-depth study of retrievability over different QCs. It analyzes the correlation of lower/higher retrieval bias with different QCs. The presence of a strong correlation between retrieval bias and QCs in the experiments indicates the possibility of determining the retrieval bias of retrieval functions without processing an exhaustive query set. Experiments are validated on the TREC Chemical Retrieval Track collection consisting of 1.2 million patent documents.
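The idea of accumulating retrievability per document and then comparing bias across query characteristics can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the abstract: retrievability is taken as a simple count of queries that rank a document within a rank cutoff, bias is summarized with a Gini coefficient, and each query carries a single QC label. The function names and the input layout are hypothetical and are not the authors' implementation.

```python
from collections import defaultdict


def retrievability(rankings, doc_ids, cutoff):
    """Count, per document, how many queries rank it within the top `cutoff`.

    Assumption: a cumulative, cutoff-based score; documents that appear in a
    ranking but not in `doc_ids` are ignored.
    """
    scores = {d: 0 for d in doc_ids}
    for ranked_docs in rankings:
        for d in ranked_docs[:cutoff]:
            if d in scores:
                scores[d] += 1
    return scores


def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


def bias_by_characteristic(query_runs, doc_ids, cutoff):
    """Group rankings by a QC label and compute one bias value per group.

    `query_runs` is a hypothetical list of (qc_label, ranked_doc_ids) pairs.
    """
    groups = defaultdict(list)
    for qc_label, ranked_docs in query_runs:
        groups[qc_label].append(ranked_docs)
    return {
        label: gini(retrievability(runs, doc_ids, cutoff).values())
        for label, runs in groups.items()
    }
```

Under these assumptions, a higher value for one QC group than for another would suggest that queries with that characteristic concentrate retrieval on fewer documents.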
Pages: 1515-1532
Page count: 18