Summary Graphs for Relational Database Schemas

被引:0
作者
Yang, Xiaoyan [1 ]
Procopiuc, Cecilia M. [2 ]
Srivastava, Divesh [2 ]
机构
[1] Natl Univ Singapore, Singapore 117543, Singapore
[2] AT&T Labs Res, Florham Pk, NJ 07932 USA
来源
PROCEEDINGS OF THE VLDB ENDOWMENT | 2011年 / 4卷 / 11期
关键词
D O I
暂无
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Increasingly complex databases need ever more sophisticated tools to help users understand their schemas and interact with the data. Existing tools fall short of either providing the "big picture," or of presenting useful connectivity information. In this paper we define summary graphs, a novel approach for summarizing schemas. Given a set of user-specified query tables, the summary graph automatically computes the most relevant tables and joins for that query set. The output preserves the most informative join paths between the query tables, while meeting size constraints. In the process, we define a novel information-theoretic measure over join edges. Unlike most subgraph extraction work, we allow metaedges (i.e., edges in the transitive closure) to help reduce output complexity. We prove that the problem is NP-Hard, and solve it as an integer program. Our extensive experimental study shows that our method returns high-quality summaries under independent quality measures.
引用
收藏
页码:899 / 910
页数:12
相关论文
共 21 条
[1]  
[Anonymous], 2009, PROC VLDB ENDOW
[2]   Fast algorithms for constructing t-spanners and paths with stretch t [J].
Cohen, E .
SIAM JOURNAL ON COMPUTING, 1998, 28 (01) :210-236
[3]  
Dasu T., 2002, SIGMOD, P240, DOI DOI 10.1145/564691.564719
[4]  
Faloutsos Christos, 2004, P KDD, P118, DOI DOI 10.1145/1014052.1014068
[5]  
He H, 2007, PROC ACM SIGMOD INT
[6]  
Jayapandian M., 2008, P 11 INT C EXT DAT T, P416, DOI DOI 10.1145/1353343.1353395
[7]  
Kang J., 2003, P 2003 ACM SIGMOD IN, P205
[8]  
Kasneci G, 2009, P 18 ACM C INF KNOWL, P1653, DOI DOI 10.1145/1645953.1646196
[9]  
Koren Y., 2007, TKDD, V1, p245
[10]  
Kraskov A, 2009, INFORM THEORY STAT L, P101