Balanced Graph Edge Partition

被引:96
作者
Bourse, Florian [1 ]
Lelarge, Marc [2 ]
Vojnovic, Milan [3 ]
机构
[1] ENS, Paris, France
[2] INRIA ENS, Paris, France
[3] Microsoft Res, Cambridge, England
来源
PROCEEDINGS OF THE 20TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING (KDD'14) | 2014年
关键词
Graph Edge Partition; Distributed Massive Computations; Approximation Algorithms; Streaming Heuristics; APPROXIMATION; FRAMEWORK;
D O I
10.1145/2623330.2623660
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Balanced edge partition has emerged as a new approach to partition an input graph data for the purpose of scaling out parallel computations, which is of interest for several modern data analytics computation platforms, including platforms for iterative computations, machine learning problems, and graph databases. This new approach stands in a stark contrast to the traditional approach of balanced vertex partition, where for given number of partitions, the problem is to minimize the number of edges cut subject to balancing the vertex cardinality of partitions. In this paper, we first characterize the expected costs of vertex and edge partitions with and without aggregation of messages, for the commonly deployed policy of placing a vertex or an edge uniformly at random to one of the partitions. We then obtain the first approximation algorithms for the balanced edge-partition problem which for the case of no aggregation matches the best known approximation ratio for the balanced vertex-partition problem, and show that this remains to hold for the case with aggregation up to factor that is equal to the maximum in-degree of a vertex. We report results of an extensive empirical evaluation on a set of real-world graphs, which quantifies the benefits of edgevs. vertex-partition, and demonstrates efficiency of natural greedy online assignments for the balanced edge-partition problem with and with no aggregation.
引用
收藏
页码:1456 / 1465
页数:10
相关论文
共 39 条
[1]  
[Anonymous], 2013, P 8 ACM EUR C COMP S, DOI DOI 10.1145/2465351.2465369
[2]  
[Anonymous], P 2010 ACM SIGMOD IN, DOI [DOI 10.1145/1807167.1807184, 10.1145/1807167.1807184]
[3]  
[Anonymous], 2012, P 18 ACM SIGKDD INT, DOI [10.1145/2339530.2339722, DOI 10.1145/2339530.2339722]
[4]  
[Anonymous], 2005, The Parallel BGL: A generic library for distributed graph computations
[5]  
[Anonymous], 2004, P 16 ANN ACM S PAR A, DOI DOI 10.1145/1007912.1007931
[6]  
[Anonymous], 2010, Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence (UAI)
[7]  
[Anonymous], 2014, T A S F GIRAPH
[8]  
[Anonymous], 2006, P 20 INT C PAR DISTR
[9]  
[Anonymous], 2004, YRL2004036
[10]  
[Anonymous], 2012, P 10 USENIX S OP SYS