Exploiting local similarity for indexing paths in graph-structured data

被引:92
作者
Kaushik, R [1 ]
Shenoy, P [1 ]
Bohannon, P [1 ]
Gudes, E [1 ]
机构
[1] Univ Wisconsin, Madison, WI 53706 USA
来源
18TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS | 2002年
关键词
D O I
10.1109/ICDE.2002.994703
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
XML and other semi-structured data may have partially specified or missing schema information, motivating the use of a structural summary which can be automatically computed front the data. These summaries also serve as indices for evaluating the complex path expressions common to XML and semi-structured query languages. However, to answer all path queries accurately, summaries must encode information about long, seldom-queried paths, leading to increased size and complexity with little added value. We introduce the A(k)-indices, a family of approximate structural summaries. They are based on the concept of k-bisimilarity, in which nodes are grouped based on local structure, i.e., the incoming paths of length up to k. The parameter k thus smoothly varies the level of detail (and accuracy) of the A(k)-index. For small values of k, the size of the index is substantially reduced. While smaller, the A(k) index is approximate, and we describe techniques for efficiently extracting exact answers to regular path queries. Our experiments show that, for moderate values of k, path evaluation using the A(k)-index ranges from being very efficient for simple queries to competitive for most complex queries, while using significantly less space than comparable structures.
引用
收藏
页码:129 / 140
页数:12
相关论文
共 31 条
[1]  
Abiteboul S., 1997, INT C DAT THEOR ICDT
[2]  
Abiteboul S., 1999, DATA WEB RELATIONS S
[3]  
Aboulnaga Ashraf, 2001, P VLDB
[4]  
[Anonymous], 1980, CALCULUS COMMUNICATI, DOI DOI 10.1007/3-540-10235-3
[5]  
BOHANNON P, 2000, EFFICIENT INDEXING X
[6]   UnQL: a query language and algebra for semistructured data based on structural recursion [J].
Buneman, P ;
Fernandez, M ;
Suciu, D .
VLDB JOURNAL, 2000, 9 (01) :76-110
[7]  
Chamberlin D., 2000, XQUERY QUERY LANGUAG
[8]  
CLARK J, 1999, EXTENSIBLE STYLESHEE
[9]  
Clark J., 1999, XML PATH LANGUAGE XP
[10]  
COOPER B, 2001, P VLDB