Combining Vertex-Centric Graph Processing with SPARQL for Large-Scale RDF Data Analytics

Cited by: 9
Authors
Abdelaziz, Ibrahim [1]
Harbi, Razen [2]
Salihoglu, Semih [3]
Kalnis, Panos [1]
Affiliations
[1] KAUST, Thuwal 23955, Saudi Arabia
[2] Saudi Aramco, Thuwal 23955, Saudi Arabia
[3] Univ Waterloo, Waterloo, ON N2L 3G1, Canada
Keywords
RDF data; graph analytics; SPARQL; vertex-centric; QUERIES; SYSTEM;
DOI
10.1109/TPDS.2017.2720174
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
Modern applications require sophisticated analytics on RDF graphs that combine structural queries with generic graph computations. Existing systems support either declarative SPARQL queries or generic graph processing, but not both. We bridge the gap by introducing Spartex, a versatile framework for complex RDF analytics. Spartex extends SPARQL to seamlessly combine generic graph algorithms (e.g., PageRank and shortest paths) with SPARQL queries. Spartex builds on existing vertex-centric graph processing frameworks, such as GraphLab or Pregel. It implements a generic SPARQL operator as a vertex-centric program that interprets SPARQL queries and executes them efficiently using a built-in optimizer. In addition, any graph algorithm implemented in the underlying vertex-centric framework can be executed in Spartex. We present various scenarios where our framework significantly simplifies the implementation of complex RDF data analytics programs. We demonstrate that Spartex scales to datasets with billions of edges, and show that our core SPARQL engine is at least as fast as state-of-the-art specialized RDF engines. For complex analytical tasks that combine generic graph processing with SPARQL, Spartex is at least an order of magnitude faster than existing alternatives.
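To make the idea of pairing a vertex-centric computation with a SPARQL-style filter concrete, the following is a minimal sketch in plain Python. It is not Spartex's API or its SPARQL extension syntax, neither of which appears in this record. The function names (`run_vertex_centric`, `pagerank_compute`, `match_subjects`), the `links` and `type` predicates, and the toy triples are all hypothetical and chosen only for illustration. The sketch runs a synchronous, Pregel-style PageRank over the `links` edges of a small RDF-like triple set, then keeps the ranks of subjects that match a single triple pattern.

```python
from collections import defaultdict


def run_vertex_centric(edges, compute, init, supersteps):
    """Toy synchronous, Pregel-style loop (hypothetical helper).

    In each superstep every vertex reads its inbox, updates its value,
    and sends one message along each of its out-edges.
    """
    out = defaultdict(list)
    vertices = set()
    for s, _p, o in edges:
        out[s].append(o)
        vertices.update((s, o))
    n = len(vertices)
    values = {v: init(v, n) for v in vertices}
    inbox = {v: [] for v in vertices}
    for step in range(supersteps):
        next_inbox = {v: [] for v in vertices}
        for v in vertices:
            values[v], msg = compute(step, values[v], inbox[v], len(out[v]), n)
            if msg is not None:
                for nbr in out[v]:
                    next_inbox[nbr].append(msg)
        inbox = next_inbox
    return values


def pagerank_compute(step, value, messages, out_degree, n, damping=0.85):
    """Per-vertex PageRank update; vertices without out-edges send nothing,
    so their rank mass is dropped (a simplification of this sketch)."""
    if step > 0:
        value = (1 - damping) / n + damping * sum(messages)
    msg = value / out_degree if out_degree else None
    return value, msg


def match_subjects(triples, predicate, obj):
    """Stand-in for a one-pattern query such as `?x type Person`."""
    return {s for s, p, o in triples if p == predicate and o == obj}


# Hypothetical toy data: a three-node cycle plus two type assertions.
triples = [
    ("a", "links", "b"), ("b", "links", "c"), ("c", "links", "a"),
    ("a", "type", "Person"), ("c", "type", "Person"),
]
link_edges = [t for t in triples if t[1] == "links"]
ranks = run_vertex_centric(link_edges, pagerank_compute, lambda v, n: 1.0 / n, 20)
people = match_subjects(triples, "type", "Person")
result = {v: ranks[v] for v in people}
```

Here the graph computation and the pattern match are two separate passes glued together by hand. The abstract's claim is that Spartex removes this glue by running both inside one vertex-centric framework behind a single extended SPARQL query.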
Pages: 3374-3388
Page count: 15