A Parallel Preconditioned Conjugate Gradient Solver for the Poisson Problem on a Multi-GPU Platform

Cited by: 69
Authors
Ament, M. [1]
Knittel, G. [2]
Weiskopf, D. [1]
Strasser, W. [2]
Affiliations
[1] Univ Stuttgart, VISUS Visualizat Res Ctr, Stuttgart, Germany
[2] Univ Tubingen, WSI GRIS, Tubingen, Germany
Source
PROCEEDINGS OF THE 18TH EUROMICRO CONFERENCE ON PARALLEL, DISTRIBUTED AND NETWORK-BASED PROCESSING | 2010
Keywords
DOI
10.1109/PDP.2010.51
Chinese Library Classification
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
We present a parallel conjugate gradient solver for the Poisson problem optimized for multi-GPU platforms. Our approach includes a novel heuristic Poisson preconditioner well suited for massively-parallel SIMD processing. Furthermore, we address the problem of limited transfer rates over typical data channels such as the PCI-express bus relative to the bandwidth requirements of powerful GPUs. Specifically, naive communication schemes can severely reduce the achievable speedup in such communication-intense algorithms. For this reason, we employ overlapping memory transfers to establish a high level of concurrency and to improve scalability. We have implemented our model on a high-performance workstation with multiple hardware accelerators. We discuss the mathematical principles, give implementation details, and present the performance and the scalability of the system.
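The preconditioned conjugate gradient iteration named in the abstract can be sketched as follows. This is only a minimal CPU illustration under stated assumptions, not the authors' implementation: it uses a 1D Poisson stencil, a plain Jacobi (diagonal) preconditioner as a stand-in for the paper's heuristic preconditioner, and hypothetical function names (`apply_poisson_1d`, `jacobi`, `pcg`). It shows none of the multi-GPU partitioning or overlapped transfers.

```python
def apply_poisson_1d(u):
    """Apply the 1D (-1, 2, -1) Poisson stencil with zero Dirichlet boundaries."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out.append(2.0 * u[i] - left - right)
    return out


def jacobi(r):
    """Stand-in preconditioner (assumption): divide by the stencil diagonal, 2."""
    return [ri / 2.0 for ri in r]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pcg(apply_a, apply_m_inv, b, tol=1e-10, max_iter=1000):
    """Preconditioned conjugate gradient for a symmetric positive definite operator.

    apply_a(v) returns A v, apply_m_inv(r) returns M^{-1} r.
    Returns the approximate solution and the number of iterations used.
    """
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for x = 0
    z = apply_m_inv(r)               # preconditioned residual
    p = list(z)                      # initial search direction
    rz = dot(r, z)
    for it in range(max_iter):
        if dot(r, r) ** 0.5 < tol:   # stop once the residual norm is small
            return x, it
        ap = apply_a(p)
        alpha = rz / dot(p, ap)      # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = apply_m_inv(r)
        rz_new = dot(r, z)
        beta = rz_new / rz           # makes the next direction A-conjugate
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


if __name__ == "__main__":
    rhs = [1.0] * 16
    solution, iterations = pcg(apply_poisson_1d, jacobi, rhs)
    print(iterations, solution[:3])
```

Because the diagonal of this stencil is constant, the Jacobi stand-in only rescales the residual and does not change the iteration count; a preconditioner that approximates the inverse operator more closely is what would reduce it.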
Pages: 583-592
Page count: 10