An Adaptive Performance Modeling Tool for GPU Architectures

Cited by: 80
Authors
Baghsorkhi, Sara S. [1]
Delahaye, Matthieu [1]
Patel, Sanjay J. [1]
Gropp, William D. [1]
Hwu, Wen-mei W. [1]
Affiliations
[1] University of Illinois, Urbana, IL 61801, USA
Funding
U.S. National Science Foundation
Keywords
Design; Measurement; Performance; Analytical model; GPU; Parallel programming; Performance estimation
DOI
10.1145/1837853.1693470
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
This paper presents an analytical model to predict the performance of general-purpose applications on a GPU architecture. The model is designed to provide performance information to an auto-tuning compiler and help it narrow its search to the more promising implementations. It can also be incorporated into a tool that helps programmers assess the performance bottlenecks in their code. We analyze each GPU kernel and identify how it exercises major GPU microarchitecture features. To identify performance bottlenecks accurately, we introduce an abstract interpretation of a GPU kernel, the work flow graph, from which we estimate the kernel's execution time. We validated the model on NVIDIA GPUs using CUDA (Compute Unified Device Architecture), with data-parallel benchmarks that stress different GPU microarchitecture events such as uncoalesced memory accesses, scratch-pad memory bank conflicts, and control flow divergence. These events must be modeled accurately but pose challenges for analytical performance models. The proposed model captures full system complexity and shows high accuracy in predicting the performance trends of different optimized kernel implementations. We also describe our approach to extracting the performance model automatically from kernel code.
Pages: 105-114
Page count: 10