TuG Synopses for Approximate Query Answering

被引:12
作者
Spiegel, Joshua [1 ]
Polyzotis, Neoklis [2 ]
机构
[1] Oracle Corp, Redwood Shores, CA 94065 USA
[2] Univ Calif Santa Cruz, Dept Comp Sci, Santa Cruz, CA 95064 USA
来源
ACM TRANSACTIONS ON DATABASE SYSTEMS | 2009年 / 34卷 / 01期
关键词
Algorithms; Performance; Theory; Data synopses; approximate query processing; selectivity estimation;
D O I
10.1145/1508857.1508860
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
This article introduces the Tuple Graph (TuG) synopses, a new class of data summaries that enable accurate approximate answers for complex relational queries. The proposed summarization framework adopts a "semi-structured" view of the relational database, modeling a relational data set as a graph of tuples and join queries as graph traversals, respectively. The key idea is to approximate the structure of the induced data graph in a concise synopsis, and to approximate the answer to a query by performing the corresponding traversal over the summarized graph. We detail the (TuG) synopsis model that is based on this novel approach, and we describe an efficient and scalable construction algorithm for building accurate (TuG) within a specific storage budget. We validate the performance of (TuG) with an extensive experimental study on real-life and synthetic datasets. Our results verify the effectiveness of (TuG) in generating accurate approximate answers for complex join queries, and demonstrate their benefits over existing summarization techniques.
引用
收藏
页数:56
相关论文
共 23 条
[1]  
ABOULNAGA A, 1999, P ACM SIGMOD INT C M
[2]  
Acharya S, 1999, SIGMOD RECORD, VOL 28, NO 2 - JUNE 1999, P275, DOI 10.1145/304181.304207
[3]  
Agrawal R., 1998, Proc. of ACM SIGMOD, P94
[4]  
Alon N., 1996, Proceedings of the Twenty-Eighth Annual ACM Symposium on the Theory of Computing, P20, DOI 10.1145/237814.237823
[5]  
[Anonymous], P ACM SIGMOD INT C M
[6]  
[Anonymous], P 27 ANN ACM S THEOR
[7]  
[Anonymous], P LINKKDD WORKSH 10
[8]  
[Anonymous], 1996, P ACM SIGMOD INT C M
[9]  
[Anonymous], P ACM INT C MAN DAT
[10]  
[Anonymous], 2002, P 2002 ACM SIGMOD IN