GraphH: High Performance Big Graph Analytics in Small Clusters

Cited by: 4
Authors
Sun, Peng [1]
Wen, Yonggang [1]
Ta Nguyen Binh Duong [1]
Xiao, Xiaokui [1]
Affiliations
[1] Nanyang Technological University, School of Computer Science and Engineering, Singapore, Singapore
Source
2017 IEEE International Conference on Cluster Computing (CLUSTER) | 2017
Keywords
Graph Processing; Distributed Computing System; Network; Systems
DOI
10.1109/CLUSTER.2017.51
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Real-world applications commonly analyze big graphs using distributed graph processing systems. Popular in-memory systems require an enormous amount of resources to handle big graphs. While several out-of-core approaches have been proposed for processing big graphs on disk, their high disk I/O overhead can significantly reduce performance. In this paper, we propose GraphH to enable high-performance big graph analytics in small clusters. Specifically, we design a two-stage graph partition scheme to evenly divide the input graph into partitions, and propose a GAB (Gather-Apply-Broadcast) computation model in which each worker processes one partition in memory at a time. We use an edge cache mechanism to reduce the disk I/O overhead, and design a hybrid strategy to improve communication performance. GraphH can efficiently process big graphs in small clusters or even on a single commodity server. Extensive evaluations show that GraphH can be up to 7.8x faster than popular in-memory systems, such as Pregel+ and PowerGraph, when processing generic graphs, and more than 100x faster than recently proposed out-of-core systems, such as GraphD and Chaos, when processing big graphs.
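The abstract names a Gather-Apply-Broadcast (GAB) model in which one partition is processed in memory at a time. The record gives no further detail, so the following is only a minimal single-process sketch of that idea, using PageRank as an assumed example workload. The function names (`partition_edges`, `pagerank_gab`), the modulo-by-destination partitioning, and all parameters are hypothetical illustrations; they are not the paper's two-stage partition scheme, and the edge cache and hybrid communication strategy are not modeled.

```python
from collections import defaultdict


def partition_edges(edges, num_parts):
    """Group edges by destination vertex (hypothetical stand-in partitioner).

    All in-edges of a vertex land in the same partition, so each vertex
    can be fully updated while only that partition is in memory.
    """
    parts = [[] for _ in range(num_parts)]
    for src, dst in edges:
        parts[dst % num_parts].append((src, dst))
    return parts


def pagerank_gab(edges, num_vertices, num_parts=2, iterations=20, damping=0.85):
    """PageRank written as gather, apply, and broadcast phases (a sketch)."""
    out_degree = [0] * num_vertices
    for src, _ in edges:
        out_degree[src] += 1

    parts = partition_edges(edges, num_parts)
    base = (1.0 - damping) / num_vertices
    ranks = [1.0 / num_vertices] * num_vertices

    for _ in range(iterations):
        updates = {}
        # Process one partition at a time, as a worker with limited memory would.
        for part in parts:
            # Gather: accumulate contributions along the partition's in-edges.
            acc = defaultdict(float)
            for src, dst in part:
                acc[dst] += ranks[src] / out_degree[src]
            # Apply: compute the new value of every vertex touched here.
            for vertex, total in acc.items():
                updates[vertex] = base + damping * total
        # Broadcast: publish the new values so the next iteration sees them.
        # Vertices without in-edges receive only the base rank.
        new_ranks = [base] * num_vertices
        for vertex, value in updates.items():
            new_ranks[vertex] = value
        ranks = new_ranks
    return ranks


if __name__ == "__main__":
    # Small graph with no dangling vertices; vertex 2 has the most in-edges.
    example_edges = [(0, 1), (1, 2), (2, 0), (0, 2)]
    print(pagerank_gab(example_edges, num_vertices=4 - 1))
```

In a distributed setting the broadcast step would send updated vertex values to the other workers; here it is reduced to replacing a shared list, which is enough to show the phase ordering.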
Pages: 256-266
Page count: 11