Storing XML (with XSD) in SQL databases: Interplay of logical and physical designs

Cited by: 8
Authors
Chaudhuri, S
Chen, ZY
Shim, K
Wu, YQ
Affiliations
[1] Microsoft Corp, Res, Redmond, WA 98052 USA
[2] Univ Maryland Baltimore Cty, Dept Informat Syst, Baltimore, MD 21250 USA
[3] Seoul Natl Univ, Sch Elect Engn & Comp Sci, Seoul 151, South Korea
[4] Indiana Univ, Sch Informat, Bloomington, IN 47405 USA
Keywords
XML; physical design; relational databases
DOI
10.1109/TKDE.2005.204
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Much of business XML data has accompanying XSD specifications. In many scenarios, "shredding" such XML data into a relational storage is a popular paradigm. Optimizing evaluation of XPath queries over such XML data requires paying careful attention to both the logical and physical designs of the relational database where XML data is shredded. None of the existing solutions has taken into account physical design of the generated relational database. In this paper, we study the interplay of logical and physical design and conclude that 1) solving them independently leads to suboptimal performance and 2) there is substantial overlap between logical and physical designs: some well-known logical design transformations generate the same mappings as physical design. Furthermore, existing search algorithms are inefficient to search the extremely large space of logical and physical design combinations. We propose a search algorithm that carefully avoids searching duplicated mappings and utilizes the workload information to further prune the search space. Experimental results confirm the effectiveness of our approach.
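To make the terms in the abstract concrete, here is a minimal sketch of "shredding" and of the two kinds of design decision, using Python's standard `sqlite3` and `xml.etree.ElementTree` modules. It is not the paper's algorithm or schema: the sample document, the table names `orders` and `items`, the index name, and the helper functions are all hypothetical. The logical design choice shown is inlining the single-valued `customer` element into the parent table while placing the repeated `item` element in its own table; the physical design choice is the index on the join column.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; not taken from the paper.
XML_DOC = """
<orders>
  <order id="1"><customer>Ann</customer><item>pen</item><item>ink</item></order>
  <order id="2"><customer>Bob</customer><item>pad</item></order>
</orders>
"""


def shred(conn, xml_text):
    """Shred <order> elements into relational tables (illustrative mapping only)."""
    # Logical design: 'customer' occurs once per order, so it is inlined
    # as a column; 'item' repeats, so it gets its own table.
    conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT)")
    conn.execute("CREATE TABLE items (order_id INTEGER, name TEXT)")
    root = ET.fromstring(xml_text)
    for order in root.findall("order"):
        order_id = int(order.get("id"))
        conn.execute(
            "INSERT INTO orders VALUES (?, ?)",
            (order_id, order.findtext("customer")),
        )
        for item in order.findall("item"):
            conn.execute("INSERT INTO items VALUES (?, ?)", (order_id, item.text))
    # Physical design: an index on the join column of the child table.
    conn.execute("CREATE INDEX idx_items_order ON items(order_id)")


def items_for_customer(conn, customer):
    """SQL counterpart of the XPath query /orders/order[customer=$c]/item."""
    rows = conn.execute(
        "SELECT i.name FROM orders o JOIN items i ON i.order_id = o.order_id "
        "WHERE o.customer = ? ORDER BY i.rowid",
        (customer,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
shred(conn, XML_DOC)
result = items_for_customer(conn, "Ann")
print(result)
```

In this toy mapping, changing either decision (for example, moving `customer` into a separate table, or dropping the index) changes the SQL that the same XPath query translates to and how it is executed, which is the kind of interaction between logical and physical design the abstract describes.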
Pages: 1595-1609
Page count: 15