Efficient top-k query evaluation on probabilistic data

被引:0
作者
Re, Christopher [1 ]
Dalvi, Nilesh [1 ]
Suciu, Dan [1 ]
机构
[1] Univ Washington, Dept Comp Sci, Seattle, WA 98195 USA
来源
2007 IEEE 23RD INTERNATIONAL CONFERENCE ON DATA ENGINEERING, VOLS 1-3 | 2007年
基金
美国国家科学基金会;
关键词
D O I
暂无
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
Modem enterprise applications are forced to deal with unreliable, inconsistent and imprecise information. Probabilistic databases can model such data naturally, but SQL query evaluation on probabilistic databases is difficult: previous approaches have either restricted the SQL queries, or computed approximate probabilities, or did not scale, and it was shown recently that precise query evaluation is theoretically hard. In this paper we describe a novel approach, which computes and ranks efficiently the top-k answers to a SQL query on a probabilistic database. The restriction to top-k answers is natural, since imprecisions in the data often lead to a large number of answers of low quality, and users are interested only in the answers with the highest probabilities. The idea in our algorithm is to run in parallel several Monte-Carlo simulations, one for each candidate answer, and approximate each probability only to the extent needed to compute correctly the top-k answers.
引用
收藏
页码:861 / +
页数:2
相关论文
共 23 条
[1]  
ANANTHAKRISHNA R, 2002, VLDB, P586
[2]  
[Anonymous], SOCIAL SEMIOTICS
[3]  
[Anonymous], 2004, VLDB
[4]   THE MANAGEMENT OF PROBABILISTIC DATA [J].
BARBARA, D ;
GARCIAMOLINA, H ;
PORTER, D .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 1992, 4 (05) :487-502
[5]   New developments in ranking and selection: An empirical comparison of the three main approaches [J].
Branke, J ;
Chick, SE ;
Schmidt, C .
PROCEEDINGS OF THE 2005 WINTER SIMULATION CONFERENCE, VOLS 1-4, 2005, :708-717
[6]  
CHAUDHURI S, 2003, ACM SIGMOD
[7]  
CHENG R, 2003, SIGMOD, P551
[8]  
Das Sarma A., 2006, ICDE
[9]  
DESHPANDE A, 2004, VLDB, P588
[10]   A THEORY FOR RECORD LINKAGE [J].
FELLEGI, IP ;
SUNTER, AB .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1969, 64 (328) :1183-&