Fast Parallel Algorithms for Counting and Listing Triangles in Big Graphs

Cited by: 23
Authors
Arifuzzaman, Shaikh [1]
Khan, Maleq [2]
Marathe, Madhav [3]
Affiliations
[1] Univ New Orleans, Comp Sci Dept, 2000 Lakeshore Dr,Math 349, New Orleans, LA 70122 USA
[2] Texas A&M Univ Kingsville, Dept Elect Engn & Comp Sci, 700 Univ Blvd, Kingsville, TX 78363 USA
[3] Univ Virginia, Dept Comp Sci, 85 Engineers Way, Charlottesville, VA 22904 USA
Keywords
Triangle-counting; clustering-coefficient; massive networks; parallel algorithms; social networks; graph mining
DOI
10.1145/3365676
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Big graphs (networks) arising in numerous application areas pose significant challenges for graph analysts: they grow to billions of nodes and edges and are too large to fit in main memory. Counting the triangles in a graph is an important problem in graph mining and analysis. In this article, we present two efficient MPI-based distributed-memory parallel algorithms for counting triangles in big graphs. The first algorithm employs overlapping partitioning and efficient load-balancing schemes to achieve very fast running times. It scales well to networks with billions of nodes and can compute the exact number of triangles in a network with 10 billion edges in 16 minutes. The second algorithm divides the network into non-overlapping partitions, which makes it space-efficient. Our results on both artificial and real-world networks demonstrate a significant space saving with this algorithm. We also present a novel approach that drastically reduces communication cost, making the algorithm both space- and runtime-efficient. Further, we demonstrate how our algorithms can be used to list all triangles in a graph and to compute the clustering coefficients of nodes. Our algorithm can also be adapted into a parallel approximation algorithm using an edge sparsification method.
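The quantities named in the abstract (exact triangle count, triangle listing, per-node clustering coefficients, and an approximate count via edge sparsification) can be illustrated with a small sequential sketch. This is not the article's MPI-based parallel algorithm: it is a single-process illustration that assumes integer node IDs and a list of distinct undirected edges. The function names are hypothetical, and the sparsification shown (keep each edge independently with probability `p`, then scale the count by `1/p^3`) is one common scheme assumed here for illustration, not necessarily the one used in the article.

```python
from collections import defaultdict
import random


def build_adjacency(edges):
    """Return an undirected adjacency map, ignoring self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def list_triangles(edges):
    """List each triangle exactly once using a (degree, id) node ordering."""
    adj = build_adjacency(edges)
    # Rank nodes by (degree, id); each edge is oriented from lower to higher rank.
    rank = {v: (len(adj[v]), v) for v in adj}
    forward = {v: {w for w in adj[v] if rank[w] > rank[v]} for v in adj}
    triangles = []
    for u in adj:
        for v in forward[u]:
            # A common higher-ranked neighbour of u and v closes a triangle.
            for w in forward[u] & forward[v]:
                triangles.append((u, v, w))
    return triangles


def clustering_coefficients(edges):
    """Local clustering coefficient 2*T(v) / (d(v) * (d(v) - 1)) for every node."""
    adj = build_adjacency(edges)
    per_node = {v: 0 for v in adj}
    for tri in list_triangles(edges):
        for v in tri:
            per_node[v] += 1
    coeffs = {}
    for v, t in per_node.items():
        d = len(adj[v])
        coeffs[v] = 2.0 * t / (d * (d - 1)) if d > 1 else 0.0
    return coeffs


def approximate_triangle_count(edges, p, seed=None):
    """Assumed sparsification scheme: sample edges with probability p, scale by 1/p^3."""
    rng = random.Random(seed)
    kept = [e for e in edges if rng.random() < p]
    return len(list_triangles(kept)) / p ** 3


# Example graph: a 4-clique on nodes 0..3 plus a pendant edge (3, 4).
example_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]
example_triangles = list_triangles(example_edges)
example_coeffs = clustering_coefficients(example_edges)
```

Orienting every edge toward the higher-ranked endpoint means each triangle is found only from its lowest-ranked node, so no triangle is reported twice. A distributed version would additionally have to partition the adjacency lists across processes, which is the part the article addresses and this sketch does not.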
Pages: 34