XQueC: A query-conscious compressed XML database

Cited by: 16
Authors
Arion, Andrei [1]
Bonifati, Angela [2]
Manolescu, Ioana [1]
Pugliese, Andrea [3]
Affiliations
[1] INRIA Futurs, LRI, PCRI, F-91893 Orsay, France
[2] Icar CNR, I-87036 Arcavacata Di Rende, CS, Italy
[3] Univ Calabria, DEIS, I-87036 Arcavacata Di Rende, CS, Italy
Keywords
algorithms; design; XML databases; XML compression; XML data management; XQuery
DOI
10.1145/1239971.1239974
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
XML compression has gained prominence recently because it counters the disadvantage of the verbose representation XML gives to data. In many applications, such as data exchange and data archiving, entirely compressing and decompressing a document is acceptable. In other applications, where queries must be run over compressed documents, compression may not be beneficial since the performance penalty in running the query processor over compressed data outweighs the data compression benefits. While balancing the interests of compression and query processing has received significant attention in the domain of relational databases, these results do not immediately translate to XML data. In this article, we address the problem of embedding compression into XML databases without degrading query performance. Since the setting is rather different from relational databases, the choice of compression granularity and compression algorithms must be revisited. Query execution in the compressed domain must also be rethought in the framework of XML query processing due to the richer structure of XML data. Indeed, a proper storage design for the compressed data plays a crucial role here. The XQueC system (XQuery Processor and Compressor) covers a wide set of XQuery queries in the compressed domain and relies on a workload-based cost model to perform the choices of the compression granules and of their corresponding compression algorithms. As a consequence, XQueC provides efficient query processing on compressed XML data. An extensive experimental assessment is presented, showing the effectiveness of the cost model, the compression ratios, and the query execution times.
Pages: 35