Comparing the performance of database selection algorithms

Cited by: 34
Authors
French, JC [1]
Powell, AL [1]
Callan, J [1]
Viles, CL [1]
Emmitt, T [1]
Prey, KJ [1]
Mou, Y [1]
Affiliations
[1] Univ Virginia, Dept Comp Sci, Charlottesville, VA 22903 USA
Source
SIGIR'99: PROCEEDINGS OF 22ND INTERNATIONAL CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL | 1999
DOI
10.1145/312624.312684
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
We compare the performance of two database selection algorithms reported in the literature. Their performance is compared using a common testbed designed specifically for database selection techniques. The testbed is a decomposition of the TREC/TIPSTER data into 236 subcollections. The databases from our testbed were ranked using both the gGlOSS and CORI techniques and compared to a baseline derived from TREC relevance judgements. We examined the degree to which CORI and gGlOSS approximate this baseline. Our results confirm our earlier observation that the gGlOSS Ideal(l) ranks do not estimate relevance-based ranks well. We also find that CORI is a uniformly better estimator of relevance-based ranks than gGlOSS for the test environment used in this study. Part of the advantage of the CORI algorithm can be explained by a strong correlation between gGlOSS and a size-based baseline (SBR). We also find that CORI produces consistently accurate rankings on testbeds ranging from 100 to 921 sites. However, for a given level of recall, search effort appears to scale linearly with the number of databases.
Pages: 238-245
Number of pages: 8