Chaos: Scale-out Graph Processing from Secondary Storage

Cited by: 142
Authors
Roy, Amitabha [1 ]
Bindschaedler, Laurent [2 ]
Malicevic, Jasmina [2 ]
Zwaenepoel, Willy [2 ]
Affiliations
[1] Intel, Santa Clara, CA 95054 USA
[2] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
Source
SOSP'15: Proceedings of the Twenty-Fifth ACM Symposium on Operating Systems Principles | 2015
DOI
10.1145/2815400.2815408
CLC Classification
TP31 [Computer Software]
Subject Classification
081202; 0835
Abstract
Chaos scales graph processing from secondary storage to multiple machines in a cluster. Earlier systems that process graphs from secondary storage are restricted to a single machine, and are therefore limited by the bandwidth and capacity of that machine's storage system. Chaos is limited only by the aggregate bandwidth and capacity of all storage devices in the entire cluster. Chaos builds on the streaming partitions introduced by X-Stream in order to achieve sequential access to storage, but parallelizes the execution of streaming partitions. Chaos is novel in three ways. First, Chaos partitions for sequential storage access, rather than for locality and load balance, resulting in much lower pre-processing times. Second, Chaos distributes graph data uniformly at random across the cluster and does not attempt to achieve locality, based on the observation that in a small cluster, network bandwidth far outstrips storage bandwidth. Third, Chaos uses work stealing to allow multiple machines to work on a single partition, thereby achieving load balance at runtime. In terms of performance scaling, on 32 machines Chaos takes on average only 1.61 times longer to process a graph 32 times larger than on a single machine. In terms of capacity scaling, Chaos is capable of handling a graph with 1 trillion edges representing 16 TB of input data, a new milestone for graph processing capacity on a small commodity cluster.
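The abstract's third point, achieving load balance at runtime via work stealing rather than up-front partitioning, can be illustrated with a minimal single-process sketch. Everything below (the `Machine` class, the round-robin driver, the skewed initial assignment) is an illustrative assumption for exposition, not code or an API from Chaos itself:

```python
# Hypothetical sketch of work stealing over a pool of streaming partitions:
# each machine drains its local queue, and an idle machine takes work from
# the busiest peer so that a skewed initial assignment still balances out.
from collections import deque

class Machine:
    def __init__(self, mid, partitions):
        self.mid = mid
        self.queue = deque(partitions)   # locally assigned streaming partitions
        self.processed = []              # partitions this machine completed

    def steal_from(self, peers):
        """Take one partition from the peer with the most remaining work."""
        victims = [p for p in peers if p.queue]
        if not victims:
            return None
        victim = max(victims, key=lambda p: len(p.queue))
        return victim.queue.pop()        # steal from the tail of the victim's queue

def run(machines):
    # Round-robin simulation: in each step every machine processes one
    # partition, stealing from a peer when its own queue is empty.
    done = False
    while not done:
        done = True
        for m in machines:
            peers = [p for p in machines if p is not m]
            part = m.queue.popleft() if m.queue else m.steal_from(peers)
            if part is not None:
                m.processed.append(part)
                done = False
    return machines

# Skewed assignment: machine 0 starts with 8 of the 10 partitions.
machines = [Machine(0, range(0, 8)), Machine(1, [8]), Machine(2, [9])]
run(machines)
```

After the run, every machine has processed several partitions even though the initial assignment was badly skewed, which is the balance-at-runtime effect the abstract describes.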
Pages: 410-424
Page count: 15