Achieving Optimal Inter-Node Communication in Graph Partitioning Using Random Selection and Breadth-First Search

被引:0
作者
Srimanth Gadde
William Acosta
Jordan Ringenberg
Robert Green
Vijay Devabhaktuni
机构
[1] University of Toledo,EECS Department, College of Engineering
[2] Dish Network,Department of Computer Science
[3] The University of Findlay,Department of Computer Science
[4] Bowling Green State University,undefined
来源
International Journal of Parallel Programming | 2016年 / 44卷
关键词
Graph partitioning; Random selection; Breadth first search;
D O I
暂无
中图分类号
学科分类号
摘要
Processing large graph datasets represents an increasingly important area in computing research and applications. The size of many graph datasets has increased well beyond the processing capacity of a single computing node, thereby necessitating distributed approaches. As these datasets are processed over a distributed system of nodes, this leads to an inter-node communication cost problem that negatively affects system performance. Previously proposed algorithms implemented breadth-first search (BFS) for graph searching and focused on the execution, parallel performance and not the communication. In this paper a new methodology is proposed that combines BFS with random selection in order to partition large graph datasets and effectively minimize inter-node communication. The new method is discussed and applied to the single-source shortest path and PageRank algorithms using three graphs that are representative of real-world scenarios. Experimental results show that graph inter-node communication for canonical graphs representative of real-world data is improved up to 42 % in case of Powerlaw graph, up to 27 % in case of Random near K-regular graph (with low degree), and up to 7 % in case of Random near K-regular graph (with high degree).
引用
收藏
页码:772 / 800
页数:28
相关论文
共 20 条
  • [1] Bu Y(2010)Haloop: efficient iterative data processing on large clusters PVLDB 3 285-296
  • [2] Howe B(1984)Shortest-path algorithms: taxonomy and annotation Networks 14 275-323
  • [3] Balazinska M(1990)A bridging model for parallel computation Commun. ACM 33 103-111
  • [4] Ernst MD(2001)A random graph model for power law graphs Exp. Math. 10 53-66
  • [5] Narsingh D(1990)Uniform generation of random regular graphs of moderate degree J. Algorithm 11 52-67
  • [6] Chi-Yin P(2008)MapReduce: simplified data processing on large clusters Commun. ACM 51 107-113
  • [7] Valiant LG(2009)Computing breadth first search in large graph using hmetis partitioning Eur. J. Sci. Res. 29 215-221
  • [8] William A(2008)The promise of high-performance reconfigurable computing IEEE Comput. 41 69-76
  • [9] Fan C(1999)Use of the Mann–Whitney Stat. Med. 18 1387-1400
  • [10] Linyuan L(2011) test for clustered data Metodički Obzori 6 73-79