Comparing the performance of collection selection algorithms

Cited by: 30
Authors
Powell, AL [1]
French, JC [1]
Affiliations
[1] Univ Virginia, Dept Comp Sci, Charlottesville, VA USA
Keywords
experimentation; measurement; performance; collection selection; distributed information retrieval; database selection; distributed text retrieval; metasearch engine; resource discovery; resource ranking; resource selection; server selection; server ranking; text retrieval
DOI
10.1145/944012.944016
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The proliferation of online information resources increases the importance of effective and efficient information retrieval in a multicollection environment. Multicollection searching is cast in three parts: collection selection (also referred to as database selection), query processing, and results merging. In this work, we focus our attention on the evaluation of the first step, collection selection. In this article, we present a detailed discussion of the methodology that we used to evaluate and compare collection selection approaches, covering both test environments and evaluation measures. We compare the CORI, CVV and gGlOSS collection selection approaches using six test environments utilizing three document testbeds. We note similar trends in performance among the collection selection approaches, but the CORI approach consistently outperforms the other approaches, suggesting that effective collection selection can be achieved using limited information about each collection. The contributions of this work are both the assembled evaluation methodology and the application of that methodology to compare collection selection approaches in a standardized environment.
Pages: 412-456
Page count: 45