Parallel Sub-Structuring Methods for solving Sparse Linear Systems on a cluster of GPU

Cited by: 4
Authors
Ahamed, Abal-Kassim Cheik [1]
Magoules, Frederic [1]
Affiliations
[1] Ecole Cent Paris, CUDA Res Ctr, Paris, France
Source
2014 IEEE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS, 2014 IEEE 6TH INTL SYMP ON CYBERSPACE SAFETY AND SECURITY, 2014 IEEE 11TH INTL CONF ON EMBEDDED SOFTWARE AND SYST (HPCC,CSS,ICESS) | 2014
Keywords
Sub-structuring method; Linear algebra; Conjugate Gradient; Parallel and distributed computing; Graphics Processing Unit; GPU Computing; CUDA; Finite element; NONOVERLAPPING SCHWARZ METHODS; ABSORBING BOUNDARY-CONDITIONS; DOMAIN DECOMPOSITION METHODS; TRANSMISSION CONDITIONS; INTERFACE CONDITIONS; OPTIMAL CONVERGENCE; OVERLAP;
DOI
10.1109/HPCC.2014.24
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
The main objective of this work is to analyze the sub-structuring method for the parallel solution of sparse linear systems whose matrices arise from the discretization of partial differential equations by methods such as finite element, finite volume, and finite difference. Building on the success of general-purpose processing on graphics processing units (GPGPU), we develop a hybrid multi-GPU and CPU sub-structuring algorithm. GPU computing with CUDA is used to accelerate the operations performed on each processor. Numerical experiments are performed on a set of matrices arising from engineering problems. We compare a C+MPI implementation on a classical CPU cluster with a C+MPI+CUDA implementation on a GPU cluster. The comparison shows that CUDA speeds up the sub-structuring method by up to 19 times in double precision.
Pages: 121-128
Page count: 8
Related papers
49 in total
[1]  
AHAMED AKC, 2013, DISTR COMP APPL BUS, P16, DOI 10.1109/DCABES.2013.10
[2]  
[Anonymous], LECT NOTES COMPUTER
[3]  
[Anonymous], 2004, COMPUTATIONAL MATH
[4]  
Bahi JM, 2011, SIMUL SERIES, V43, P12
[5]  
Bakhoda A, 2009, INT SYM PERFORM ANAL, P163, DOI 10.1109/ISPASS.2009.4919648
[6]  
Bell N, 2009, STUDENTS GUIDE TO THE MA TESOL, P1
[7]  
Bell N., 2012, CUSP: Generic parallel algorithms for sparse matrix and graph computations
[8]  
Bell N., 2008, Efficient sparse matrix-vector multiplication on CUDA
[9]  
Bolz, J; Farmer, I; Grinspun, E; Schröder, P. Sparse matrix solvers on the GPU: Conjugate gradients and multigrid [J]. ACM TRANSACTIONS ON GRAPHICS, 2003, 22 (03): 917-924
[10]  
Brélaz D., 1979, COMMUN ACM, V22, P251-256