Enhancing the Performance of Conjugate Gradient Solvers on Graphic Processing Units

Cited by: 30
Authors
Dehnavi, Maryam Mehri [1]
Fernandez, David M. [1]
Giannacopoulos, Dennis [1]
Affiliations
[1] McGill Univ, Elect & Comp Engn Dept, Montreal, PQ H3A 2A7, Canada
Keywords
Computer architecture; conjugate gradients (CGs); graphic processing units (GPUs); parallel processing
DOI
10.1109/TMAG.2010.2081662
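Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract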
A study of the fundamental obstacles to accelerating the preconditioned conjugate gradient (PCG) method on modern graphic processing units (GPUs) is presented, and several techniques are proposed to enhance its performance over previous work, independent of the GPU generation and the matrix sparsity pattern. The proposed enhancements increase the performance of PCG by up to 23 times compared to vector-optimized PCG results on modern CPUs and by up to 3.4 times compared to previous GPU results.
Pages: 1162-1165
Number of pages: 4