A Mean-Variance Analysis Based Approach for Search Result Diversification in Federated Search

Cited by: 3
Authors
Ghansah, Benjamin [1]
Wu, Shengli [1]
Affiliations
[1] Jiangsu Univ, Sch Comp Sci & Commun Engn, 301 Xuefu Rd, Zhenjiang 212013, Jiangsu, Peoples R China
Keywords
Distributed information retrieval; federated search; resource selection; diversification; mean-variance analysis;
DOI
10.1142/S0218488516500100
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Resource selection is an important step in a federated search environment. This work aims to improve collection selection by choosing collections on both relevance and diversity, so as to best answer a user's query. Documents sampled from the Central Sample Database are first ranked by the Indri retrieval algorithm and then re-ranked by a mean-standard deviation method that reduces uncertainty and improves the diversity of collection sources. A comparative evaluation using R-based diversification metrics shows that the proposed method significantly outperforms the baseline diversification methods (ReDDE+MMR and ReDDE+MAP-IA) and the state-of-the-art resource selection methods (ReDDE and CORI) on all metrics.
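The two-stage pipeline described in the abstract (an initial relevance ranking followed by a risk-aware re-ranking) can be illustrated with a minimal sketch. The abstract does not give the authors' exact formulation, so the code below assumes a generic greedy mean-variance objective in the style of portfolio-theory ranking; the function names, the risk-aversion parameter `b`, the correlation matrix, and the vote-counting step that maps re-ranked documents to collections are all hypothetical choices made for illustration, not the method of the paper.

```python
from collections import Counter
from typing import List, Sequence


def mean_variance_rerank(
    means: Sequence[float],
    stds: Sequence[float],
    corr: Sequence[Sequence[float]],
    b: float = 1.0,
) -> List[int]:
    """Greedily reorder documents by relevance minus a risk penalty.

    ASSUMPTION: this is a generic mean-variance objective, not the
    paper's formula. `means` are initial relevance scores (e.g. from a
    first-pass retrieval run), `stds` are score uncertainties, `corr`
    is a pairwise document correlation matrix, and `b` is a
    hypothetical risk-aversion weight.
    """
    remaining = list(range(len(means)))
    selected: List[int] = []
    while remaining:
        def utility(i: int) -> float:
            # Penalize a candidate's own variance plus its covariance
            # with every document already placed in the ranking.
            redundancy = sum(stds[j] * corr[i][j] for j in selected)
            return means[i] - b * stds[i] ** 2 - 2 * b * stds[i] * redundancy

        best = max(remaining, key=utility)
        selected.append(best)
        remaining.remove(best)
    return selected


def select_collections(
    ranking: Sequence[int],
    doc_collection: Sequence[str],
    top_k: int,
    n_collections: int,
) -> List[str]:
    """Rank collections by how many of their sampled documents appear
    in the top_k of the re-ranked list (an illustrative voting scheme)."""
    votes = Counter(doc_collection[i] for i in ranking[:top_k])
    return [name for name, _ in votes.most_common(n_collections)]


# Toy data: documents 0 and 1 are near-duplicates from collection "A";
# document 2 is less relevant but uncorrelated and comes from "B".
means = [0.9, 0.85, 0.6]
stds = [0.2, 0.2, 0.2]
corr = [[1.0, 0.9, 0.0], [0.9, 1.0, 0.0], [0.0, 0.0, 1.0]]
doc_collection = ["A", "A", "B"]

relevance_only = mean_variance_rerank(means, stds, corr, b=0.0)
diversified = mean_variance_rerank(means, stds, corr, b=5.0)
chosen = select_collections(diversified, doc_collection, top_k=2, n_collections=2)
```

With `b = 0` the ordering reduces to the plain relevance ranking; raising `b` pushes the near-duplicate document down, so the top of the list draws on more than one collection.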
Pages: 195-211
Page count: 17