DisC Diversity: Result Diversification based on Dissimilarity and Coverage

被引:51
作者
Drosou, Marina [1 ]
Pitoura, Evaggelia [1 ]
机构
[1] Univ Ioannina, Dept Comp Sci, Ioannina, Greece
来源
PROCEEDINGS OF THE VLDB ENDOWMENT | 2012年 / 6卷 / 01期
关键词
Computational complexity;
D O I
10.14778/2428536.2428538
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Recently, result diversification has attracted a lot of attention as a means to improve the quality of results retrieved by user queries. In this paper, we propose a new, intuitive definition of diversity called DisC diversity. A DisC diverse subset of a query result contains objects such that each object in the result is represented by a similar object in the diverse subset and the objects in the diverse subset are dissimilar to each other. We show that locating a minimum DisC diverse subset is an NP-hard problem and provide heuristics for its approximation. We also propose adapting DisC diverse subsets to a different degree of diversification. We call this operation zooming. We present efficient implementations of our algorithms based on the M-tree, a spatial index structure, and experimentally evaluate their performance.
引用
收藏
页码:13 / 24
页数:12
相关论文
共 30 条
[1]  
Agrawal R., 2009, WSDM
[2]  
Angel A., 2011, SIGMOD
[3]  
[Anonymous], 2005, WWW
[4]  
Boim R., 2011, CIKM
[5]  
Borodin A., 2012, PODS
[6]  
Chlebik M., 2008, INF COMPUT, V206
[7]   UNIT DISK GRAPHS [J].
CLARK, BN ;
COLBOURN, CJ ;
JOHNSON, DS .
DISCRETE MATHEMATICS, 1990, 86 (1-3) :165-177
[8]  
Clarke CLA, 2008, SIGIR
[9]  
Drosou M., 2010, SIGMOD RECORD, V39
[10]  
Drosou M., 2012, EDBT