OpenMP to GPGPU: A Compiler Framework for Automatic Translation and Optimization

Cited by: 169
Authors
Lee, Seyong [1]
Min, Seung-Jai [1]
Eigenmann, Rudolf [1]
Affiliation
[1] Purdue Univ, Sch ECE, W Lafayette, IN 47907 USA
Funding
National Science Foundation (USA)
Keywords
Algorithms; Design; Performance; OpenMP; GPU; CUDA; Automatic Translation; Compiler Optimization; PROGRAMS;
DOI
10.1145/1594835.1504194
Chinese Library Classification
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
GPGPUs have recently emerged as powerful vehicles for general-purpose high-performance computing. Although a new Compute Unified Device Architecture (CUDA) programming model from NVIDIA offers improved programmability for general computing, programming GPGPUs is still complex and error-prone. This paper presents a compiler framework for automatic source-to-source translation of standard OpenMP applications into CUDA-based GPGPU applications. The goal of this translation is to further improve programmability and make existing OpenMP applications amenable to execution on GPGPUs. In this paper, we have identified several key transformation techniques, which enable efficient GPU global memory access, to achieve high performance. Experimental results from two important kernels (JACOBI and SPMUL) and two NAS OpenMP Parallel Benchmarks (EP and CG) show that the described translator and compile-time optimizations work well on both regular and irregular applications, leading to performance improvements of up to 50X over the unoptimized translation (up to 328X over serial on a CPU).
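As a rough illustration of the kind of mapping the abstract describes, the sketch below shows a JACOBI-style OpenMP work-sharing loop next to a kernel-style body in which each (block, thread) pair handles one loop iteration. It is a CPU-only emulation in plain C, not the translator's actual output: the function names, the 1D stencil, and the block-size parameter are all assumptions made for this example. Consecutive emulated threads touch consecutive array elements, which is the access pattern that global-memory optimizations on a GPU generally aim for.

```c
#include <stddef.h>

/* Hypothetical OpenMP source loop (names are illustrative only).
   The pragma is guarded so the code also builds without OpenMP support. */
void jacobi_row_openmp(const float *in, float *out, size_t n)
{
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (long i = 1; i < (long)n - 1; i++) {
        out[i] = 0.5f * (in[i - 1] + in[i + 1]);
    }
}

/* Kernel-style body: one call performs one iteration, chosen from
   (block_idx, thread_idx) the way a CUDA thread would choose it from
   blockIdx.x and threadIdx.x. Assumes n >= 3 (checked by the caller). */
static void jacobi_row_kernel_body(const float *in, float *out, size_t n,
                                   size_t block_idx, size_t block_dim,
                                   size_t thread_idx)
{
    size_t i = block_idx * block_dim + thread_idx + 1; /* +1 skips the boundary */
    if (i < n - 1) { /* guard for the partially filled last block */
        out[i] = 0.5f * (in[i - 1] + in[i + 1]);
    }
}

/* Serial emulation of a kernel launch: iterate over every block and
   every thread in the block. On a real GPU these would run in parallel. */
void jacobi_row_emulated_launch(const float *in, float *out, size_t n,
                                size_t block_dim)
{
    if (n < 3 || block_dim == 0) {
        return; /* nothing to compute */
    }
    size_t iterations = n - 2;
    size_t grid_dim = (iterations + block_dim - 1) / block_dim; /* ceiling */
    for (size_t b = 0; b < grid_dim; b++) {
        for (size_t t = 0; t < block_dim; t++) {
            jacobi_row_kernel_body(in, out, n, b, block_dim, t);
        }
    }
}
```

Calling `jacobi_row_openmp(in, a, n)` and `jacobi_row_emulated_launch(in, b, n, 4)` on the same input should leave `a` and `b` identical, since both perform the same per-iteration work and differ only in how iterations are assigned.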
Pages: 101-110
Page count: 10