Distributed Graph Path Queries using Spark

Cited by: 3
Authors
Balaji, Janani [1]
Sunderraman, Rajshekhar [1]
Affiliations
[1] Georgia State Univ, Dept Comp Sci, Atlanta, GA 30303 USA
Source
PROCEEDINGS 2016 IEEE 40TH ANNUAL COMPUTER SOFTWARE AND APPLICATIONS CONFERENCE WORKSHOPS (COMPSAC), VOL 2 | 2016
DOI
10.1109/COMPSAC.2016.98
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
Graphs are increasingly being used as the data structure of choice to represent interactions between heterogeneous entities. Graph path querying is a primary operation in the network graph space, for both real-time querying and inferential analysis. The rate and volume of interconnected data being generated warrant efficient distributed solutions to manage and query network graphs in a scalable fashion. Existing distributed solutions have proposed several optimization techniques, including intelligent joins and partial evaluations, to process path queries. However, the former relies on comprehensive indices, while the latter involves extensive driver-side processing to combine the partial results; neither is efficient for processing large graphs. In this paper, we propose a novel distributed graph path query processing system using the Apache Spark framework.
Pages: 326-331
Page count: 6