Evaluating the Scaling of Graph-Algorithms for Big Data using GraphX

Cited by: 44
Authors
Andersen, Jakob Smedegaard [1]
Zukunft, Olaf [1]
Affiliations
[1] HAW Hamburg, Dept Comp Sci, Hamburg, Germany
Source
PROCEEDINGS 2016 2ND INTERNATIONAL CONFERENCE ON OPEN AND BIG DATA - OBD 2016 | 2016
Keywords
GraphX; Graph Processing; Semi-Clustering; Collaborative Filtering; Parallel Computing;
DOI
10.1109/OBD.2016.8
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Graph processing has received a lot of attention in different big data scenarios. In this paper, we present the design, implementation, and experimental evaluation of graph processing algorithms in two different application areas. First, we use semi-clustering as an example of an algorithm typically used in social network analysis. Then, we examine an algorithm for collaborative filtering as typically used in e-commerce scenarios. For both algorithms, we use Apache GraphX, an existing distributed graph processing framework based on Apache Spark. As GraphX does not include these two algorithms, we describe how to implement them using a combination of GraphX and the underlying Spark Core. Based on our implementation, we perform experiments to test the scalability of both the algorithms and the GraphX processing framework. The experiments show that different kinds of graph algorithms can be supported within the Spark framework. Furthermore, we show that for our test data the algorithms scale almost linearly when properly designed.
Pages: 1 / 8
Page count: 8