Computing Large Connected Components Using Map Reduce In Logarithmic Rounds

被引:110
作者
Sharma, Amar [1 ]
Misra, Rajiv [1 ]
机构
[1] IIT, Dept Comp Sci & Engn, Patna, Bihar, India
来源
2016 IEEE 2ND INTERNATIONAL CONFERENCE ON BIG DATA SECURITY ON CLOUD (BIGDATASECURITY), IEEE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE AND SMART COMPUTING (HPSC), AND IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT DATA AND SECURITY (IDS) | 2016年
关键词
D O I
10.1109/BigDataSecurity-HPSC-IDS.2016.18
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Graph structures are often used for representing data object and link between them in large datasets. Knowledge extraction from these data relies on finding the connected components within these graphs. Given a large graph G = (V,E), where V is the set of vertices and E is the set of edges, the problem is to find the connected components efficiently. The problem of finding the connected components is labeling each vertex with its graph component number. Recent works have been done to address this problem in MapReduce, but there always exists a trade-off between number of rounds and the communications per round. Let d be the diameter of the graph and n be the number of nodes in the largest component, all the prior implementations of this algorithm in MapReduce take linear, O(d), number of map reduce rounds and quadratic, O(n|V| +|E|), communications per round. The efficient algorithm proposed here finishes it in 2(log d) round and take (|V| + |E|) number of communications per round (where d = diameter of graph with V vertices and E edges, n = Number of nodes in largest component).
引用
收藏
页码:1 / 6
页数:6
相关论文
共 12 条
[1]  
Afrati F.N., 2011, EDBT, P1, DOI DOI 10.1145/1951365.1951367
[2]  
[Anonymous], 2004, OSDI 04
[3]  
Cohen J., 2009, COMPUTING SCI ENG, V11
[4]   PEGASUS: A Peta-Scale Graph Mining System - Implementation and Observations [J].
Kang, U. ;
Tsourakakis, Charalampos E. ;
Faloutsos, Christos .
2009 9TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2009, :229-238
[5]  
Karger David R., 1992, FAST CONNECTED COMPO, P373
[6]  
Karloff H, 2010, PROC APPL MATH, V135, P938
[7]   Google's MapReduce programming model -: Revisited [J].
Laemmel, Ralf .
SCIENCE OF COMPUTER PROGRAMMING, 2008, 70 (01) :1-30
[8]  
Papadimitriou Spiros, IEEE INT C DAT MIN
[9]   MapReduce in MPI for Large-scale graph algorithms [J].
Plimpton, Steven J. ;
Devine, Karen D. .
PARALLEL COMPUTING, 2011, 37 (09) :610-632
[10]  
Rastogi V., 2012, CORR