A Compiler for Throughput Optimization of Graph Algorithms on GPUs

Cited by: 0

Authors:

Pai, Sreepathi [1]

Pingali, Keshav [1]

Affiliations:

[1] Univ Texas Austin, Austin, TX 78712 USA

Source:

ACM SIGPLAN NOTICES | 2016, Vol. 51, No. 10

Funding:

U.S. National Science Foundation

Keywords:

Graph applications; amorphous data-parallelism; GPUs; compilers; optimization; throughput

DOI:

10.1145/2983990.2984015

Chinese Library Classification (CLC) number:

TP31 [Computer Software]

Discipline classification codes:

081202; 0835

Abstract:

Writing high-performance GPU implementations of graph algorithms can be challenging. In this paper, we argue that three optimizations, called throughput optimizations, are key to high performance for this application class. These optimizations describe a large implementation space, making it unrealistic for programmers to implement them by hand. To address this problem, we have implemented these optimizations in a compiler that produces CUDA code from an intermediate-level program representation called IrGL. Compared to state-of-the-art handwritten CUDA implementations of eight graph applications, code generated by the IrGL compiler is up to 5.95x faster (median 1.4x) for five applications and never more than 30% slower for the others. Throughput optimizations contribute an improvement of up to 4.16x (median 1.4x) to the performance of unoptimized IrGL code.


Pages: 1 / 19

Page count: 19

