A MEMORY EFFICIENT AND FAST SPARSE MATRIX VECTOR PRODUCT ON A GPU

Cited by: 56
Authors
Dziekonski, A. [1 ]
Lamecki, A. [1 ]
Mrozowski, M. [1 ]
Affiliations
[1] Gdansk Univ Technol GUT, Fac Elect Telecommun & Informat ETI, WiComm Ctr Excellence, PL-80233 Gdansk, Poland
Source
PROGRESS IN ELECTROMAGNETICS RESEARCH-PIER | 2011 / Vol. 116
Keywords
FINITE-ELEMENT-METHOD; FDTD METHOD; SCATTERING; ALGORITHM; UNITS;
DOI
10.2528/PIER11031607
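Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification code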
0808 ; 0809 ;
Abstract
This paper proposes a new sparse matrix storage format which allows an efficient implementation of a sparse matrix vector product on a Fermi Graphics Processing Unit (GPU). Unlike previous formats, it has both a low memory footprint and good throughput. The new format, which we call Sliced ELLR-T, has been designed specifically for accelerating the iterative solution of large, sparse, complex-valued systems of linear equations arising in computational electromagnetics. Numerical tests have shown that the performance of the new implementation reaches 69 GFLOPS in complex single precision arithmetic. Compared to an optimized six-core Central Processing Unit (CPU) implementation (Intel Xeon 5680), this performance implies a speedup by a factor of six. In terms of speed, the new format is as fast as the best format published so far, and at the same time it does not introduce redundant zero elements which have to be stored to ensure fast memory access. Compared to previously published solutions, significantly larger problems can be handled using low-cost commodity GPUs with a limited amount of on-board memory.
Pages: 49-63
Number of pages: 15