Performance analysis and optimization of three-dimensional FDTD on GPU using roofline model

Cited by: 47
Authors
Kim, Ki-Hwan [1, 2]
Kim, KyoungHo [1]
Park, Q-Han [1]
Affiliations
[1] Korea Univ, Dept Phys, Seoul 136701, South Korea
[2] KISTI, Supercomp Ctr, Taejon 305806, South Korea
Keywords
FDTD; GPU; CUDA; Roofline
DOI
10.1016/j.cpc.2011.01.025
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
The Finite-Difference Time-Domain (FDTD) method is commonly used for electromagnetic field simulations. Recently, successful hardware acceleration using Graphics Processing Units (GPUs) has been reported for large-scale FDTD simulations. In this paper, we present a performance analysis of three-dimensional (3D) FDTD on GPU using the roofline model. We find that the theoretical predictions of maximum performance agree well with the experimental results. We also suggest suitable optimization methods for the best performance of FDTD on GPU. In particular, the optimized 3D FDTD program on GPU (NVIDIA GeForce GTX 480) is shown to be 64 times faster than the naively implemented program on CPU (Intel Core i7 2600). © 2011 Elsevier B.V. All rights reserved.
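The abstract does not spell out the roofline bound it relies on. In its commonly cited form (Williams, Waterman, and Patterson), attainable performance is the smaller of the peak compute rate and the memory bandwidth multiplied by the kernel's arithmetic intensity. The following is a minimal sketch of that bound only, not the authors' analysis: the function names and all hardware figures are hypothetical placeholders, not values taken from the paper.

```python
def roofline_gflops(peak_gflops: float,
                    bandwidth_gb_s: float,
                    intensity_flop_per_byte: float) -> float:
    """Attainable GFLOP/s under the basic roofline model.

    The bound is min(peak compute, bandwidth * arithmetic intensity).
    """
    if min(peak_gflops, bandwidth_gb_s, intensity_flop_per_byte) < 0:
        raise ValueError("all inputs must be non-negative")
    return min(peak_gflops, bandwidth_gb_s * intensity_flop_per_byte)


def ridge_point(peak_gflops: float, bandwidth_gb_s: float) -> float:
    """Arithmetic intensity (FLOP/byte) at which a kernel stops being
    memory-bound and becomes compute-bound."""
    return peak_gflops / bandwidth_gb_s


if __name__ == "__main__":
    # Placeholder hardware figures (assumptions for illustration only).
    PEAK_GFLOPS = 1000.0
    BANDWIDTH_GB_S = 100.0

    print("ridge point:", ridge_point(PEAK_GFLOPS, BANDWIDTH_GB_S), "FLOP/byte")
    for intensity in (0.25, 0.5, 10.0, 40.0):
        bound = roofline_gflops(PEAK_GFLOPS, BANDWIDTH_GB_S, intensity)
        regime = "memory-bound" if bound < PEAK_GFLOPS else "compute-bound"
        print(f"intensity {intensity:5.2f} FLOP/byte -> {bound:7.1f} GFLOP/s ({regime})")
```

Stencil updates such as FDTD typically perform few floating-point operations per byte moved, so under this bound they tend to sit on the bandwidth-limited slope of the roofline, which is why memory-access optimizations are the usual target.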
Pages: 1201-1207
Number of pages: 7