SIMD-X: Programming and Processing of Graph Algorithms on GPUs

被引：0

作者：

Liu, Hang ^{[1
]}

Huang, H. Howie ^{[2
]}

机构：

[1] Univ Massachusetts, Lowell, MA 01854 USA

[2] George Washington Univ, Washington, DC 20052 USA

来源：

PROCEEDINGS OF THE 2019 USENIX ANNUAL TECHNICAL CONFERENCE | 2019年

基金：

美国国家科学基金会;

关键词：

D O I：

暂无

中图分类号：

TP31 [计算机软件];

学科分类号：

081202 ; 0835 ;

摘要：

With high computation power and memory bandwidth, graphics processing units (GPUs) lend themselves to accelerate data-intensive analytics, especially when such applications fit the single instruction multiple data (SIMD) model. However, graph algorithms such as breadth-first search and k-core, often fail to take full advantage of GPUs, due to irregularity in memory access and control flow. To address this challenge, we have developed SIMD-X, for programming and processing of single instruction multiple, complex, data on GPUs. Specifically, the new Active-Compute-Combine (ACC) model not only provides ease of programming to programmers, but more importantly creates opportunities for system-level optimizations. To this end, SIMD-X utilizes just-in-time task management which filters out inactive vertices at runtime and intelligently maps various tasks to different amount of GPU cores in pursuit of workload balancing. In addition, SIMD-X leverages push-pull based kernel fusion that, with the help of a new deadlock-free global barrier, reduces a large number of computation kernels to very few. Using SIMD-X, a user can program a graph algorithm in tens of lines of code, while achieving 3x, 6x, 24x, 3x speedup over Gunrock, Galois, CuSha, and Ligra, respectively.

引用

页码：411 / 427

页数：17

共 88 条

[1] Abadi M, 2016, PROCEEDINGS OF OSDI'16: 12TH USENIX SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION, P265
[2] Ai LY, 2017, 2017 USENIX ANNUAL TECHNICAL CONFERENCE (USENIX ATC '17), P125
[3] [Anonymous], 2004, SDM
[4] [Anonymous], P C HIGH PERF GRAPH
[5] [Anonymous], 2012, PPOPP
[6] [Anonymous], 12 USENIX S OP SYST
[7] [Anonymous], P INT C HIGH PERF CO
[8] [Anonymous], 2013, PPOPP
[9] [Anonymous], IEEE T PARALLEL DIST
[10] [Anonymous], P 17 INT C ARCH SUP

← 1 2 3 4 5 6 7 8 9 →