A Compiler for Throughput Optimization of Graph Algorithms on GPUs

Cited by: 0
Authors
Pai, Sreepathi [1]
Pingali, Keshav [1]
Affiliations
[1] Univ Texas Austin, Austin, TX 78712 USA
Funding
National Science Foundation (USA)
Keywords
Graph applications; amorphous data-parallelism; GPUs; compilers; optimization; throughput
DOI
10.1145/2983990.2984015
Chinese Library Classification number
TP31 [Computer Software]
Subject classification codes
081202; 0835
Abstract
Writing high-performance GPU implementations of graph algorithms can be challenging. In this paper, we argue that three optimizations, called throughput optimizations, are key to high performance for this application class. These optimizations describe a large implementation space, making it unrealistic for programmers to implement them by hand. To address this problem, we have implemented these optimizations in a compiler that produces CUDA code from an intermediate-level program representation called IrGL. Compared to state-of-the-art handwritten CUDA implementations of eight graph applications, code generated by the IrGL compiler is up to 5.95x faster (median 1.4x) for five applications and never more than 30% slower for the others. Throughput optimizations contribute an improvement of up to 4.16x (median 1.4x) to the performance of unoptimized IrGL code.
Pages: 1-19
Page count: 19