Towards efficient SPARQL query processing on RDF data

Cited by: 17
Authors
Liu C. [1]
Wang H. [1]
Yu Y. [1]
Xu L. [2]
Affiliations
[1] Department of Computer Science and Engineering, Shanghai Jiao Tong University
[2] IBM China Research Lab
Keywords
optimization; resource description framework (RDF) query engine; SPARQL
DOI
10.1016/S1007-0214(10)70108-5
Abstract
Efficient support for querying large-scale resource description framework (RDF) triples plays an important role in semantic web data management. This paper presents an efficient RDF query engine for evaluating SPARQL queries, in which an inverted index structure is used to index the RDF triples. A set of operators on the inverted index was developed for query optimization and evaluation. A main-tree-shaped optimization algorithm was then developed that transforms a SPARQL query graph into the optimal query plan by effectively reducing the search space explored when determining the optimal join order. The optimization collects a set of RDF statistics for estimating the execution cost of a query plan. Finally, the optimal query plan is evaluated using the defined operators to answer the given SPARQL query. Extensive tests were conducted on both synthetic and real datasets containing up to 100 million triples. The results show that this approach can answer most queries within 1 s and is extremely efficient and scalable in comparison with state-of-the-art RDF stores.
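To make the two ideas in the abstract concrete (an inverted index over triples, and a join order driven by statistics), a minimal Python sketch follows. It is not the paper's engine: the names `TripleIndex`, `match`, and `evaluate` are hypothetical, the index is a plain in-memory map from (position, term) to triple ids, and the join order is a simple greedy heuristic based on match counts, which stands in for the main-tree-shaped optimization algorithm that the abstract names but does not detail.

```python
from collections import defaultdict


class TripleIndex:
    """Toy inverted index (assumed layout): (position, term) -> set of triple ids."""

    def __init__(self, triples):
        self.triples = list(triples)
        self.postings = defaultdict(set)
        for tid, triple in enumerate(self.triples):
            for pos, term in enumerate(triple):
                self.postings[(pos, term)].add(tid)

    def match(self, pattern):
        """Return ids of triples matching a pattern; variables start with '?'."""
        ids = None
        for pos, term in enumerate(pattern):
            if term.startswith("?"):
                continue  # a variable does not constrain this position
            posting = self.postings.get((pos, term), set())
            ids = posting if ids is None else ids & posting
        # A pattern made only of variables matches every triple.
        return set(range(len(self.triples))) if ids is None else set(ids)


def evaluate(index, patterns):
    """Join triple patterns, most selective first (greedy heuristic, not the paper's)."""
    ordered = sorted(patterns, key=lambda p: len(index.match(p)))
    solutions = [{}]
    for pattern in ordered:
        next_solutions = []
        for binding in solutions:
            # Substitute variables bound so far before probing the index.
            bound = tuple(binding.get(term, term) for term in pattern)
            for tid in index.match(bound):
                candidate = dict(binding)
                consistent = True
                for term, value in zip(bound, index.triples[tid]):
                    if term.startswith("?") and candidate.setdefault(term, value) != value:
                        consistent = False  # same variable, two different values
                        break
                if consistent:
                    next_solutions.append(candidate)
        solutions = next_solutions
    return solutions


if __name__ == "__main__":
    index = TripleIndex([
        ("alice", "knows", "bob"),
        ("bob", "knows", "carol"),
        ("alice", "worksAt", "sjtu"),
        ("carol", "worksAt", "ibm"),
    ])
    query = [("?x", "knows", "?y"), ("?y", "worksAt", "ibm")]
    print(evaluate(index, query))
```

On this toy data the query binds `?x` to `bob` and `?y` to `carol`. The pattern `("?y", "worksAt", "ibm")` is evaluated first because it matches a single triple, which illustrates why collecting statistics before choosing a join order can shrink intermediate results.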
Pages: 613-622
Page count: 9