Scalable Multi-Query Optimization for SPARQL

Cited by: 61
Authors
Le, Wangchao [1]
Kementsietsidis, Anastasios [2]
Duan, Songyun [2]
Li, Feifei [1]
Affiliations
[1] Univ Utah, Sch Comp, Salt Lake City, UT 84112 USA
[2] IBM TJ Watson Res Ctr, Hawthorne, NY USA
Source
2012 IEEE 28th International Conference on Data Engineering (ICDE) | 2012
DOI
10.1109/ICDE.2012.37
Chinese Library Classification
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
This paper revisits the classical problem of multi-query optimization in the context of RDF/SPARQL. We show that the techniques developed for relational and semi-structured data/query languages are hard, if not impossible, to extend to the RDF data model and the graph query patterns expressed in SPARQL. In light of the NP-hardness of multi-query optimization for SPARQL, we propose heuristic algorithms that partition the input batch of queries into groups such that each group of queries can be optimized together. An essential component of the optimization is an efficient algorithm to discover the common sub-structures of multiple SPARQL queries, together with an effective cost model to compare candidate execution plans. Since our optimization techniques make no assumptions about the underlying SPARQL query engine, they are portable across different RDF stores. Extensive experimental studies, performed on three popular RDF stores, show that the proposed techniques are effective, efficient and scalable.
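The grouping step described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes queries are given as lists of (subject, predicate, object) triple patterns, uses Jaccard similarity over predicate sets as a stand-in grouping heuristic, and treats shared predicates as a crude proxy for common sub-structures. The function names, the threshold, and the sample queries are all hypothetical.

```python
from typing import List, Set, Tuple

# A triple pattern is (subject, predicate, object); variables start with "?".
Triple = Tuple[str, str, str]
Query = List[Triple]


def predicates(query: Query) -> Set[str]:
    """Collect the constant (non-variable) predicates of a query."""
    return {p for _, p, _ in query if not p.startswith("?")}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_queries(queries: List[Query], threshold: float = 0.5) -> List[List[int]]:
    """Greedy grouping (an assumed heuristic): a query joins the first group
    whose seed query (its first member) is similar enough, else starts a new group."""
    groups: List[List[int]] = []
    for i, q in enumerate(queries):
        preds = predicates(q)
        for g in groups:
            if jaccard(preds, predicates(queries[g[0]])) >= threshold:
                g.append(i)
                break
        else:
            groups.append([i])
    return groups


def shared_predicates(queries: List[Query], group: List[int]) -> Set[str]:
    """Predicates common to every query in a group (proxy for shared structure)."""
    return set.intersection(*(predicates(queries[i]) for i in group))


# Hypothetical sample batch of three queries.
queries: List[Query] = [
    [("?x", "type", "Person"), ("?x", "name", "?n"), ("?x", "email", "?e")],
    [("?y", "type", "Person"), ("?y", "name", "?m"), ("?y", "phone", "?p")],
    [("?a", "locatedIn", "?b"), ("?b", "population", "?c")],
]
groups = group_queries(queries)
```

A real implementation would need to match patterns up to variable renaming and weigh candidate plans with a cost model, both of which this sketch omits.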
Pages: 666-677
Number of pages: 12