Algebraic query optimization for distributed top-k queries

被引:1
作者
Neumann, Thomas [1 ]
Michel, Sebastian [1 ]
机构
[1] Max Planck Inst Informat, Stuhlsatzenhausweg 85, D-66123 Saarbrucken, Germany
来源
COMPUTER SCIENCE-RESEARCH AND DEVELOPMENT | 2007年 / 21卷 / 3-4期
关键词
Query optimization; Top-k queries; Distributed systems;
D O I
10.1007/s00450-007-0024-2
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
Distributed top-k query processing is increasingly becoming an essential functionality in a large number of emerging application classes. This paper addresses the efficient algebraic optimization of top-k queries in wide-area distributed data repositories where the index lists for the attribute values (or text terms) of a query are distributed across a number of data peers and the computational costs include network latency, bandwidth consumption, and local peer work. We use a dynamic programming approach to find the optimal execution plan using compact data synopses for selectivity estimation that is the basis for our cost model. The optimized query is executed in a hierarchical way involving a small and fixed number of communication phases. We have performed experiments on real web data that show the benefits of distributed top-k query optimization both in network resource consumption and query response time.
引用
收藏
页码:197 / 211
页数:15
相关论文
共 59 条