Performance Analysis of Preconditioned Conjugate Gradient Solver on Heterogeneous (Multi-CPUs/Multi-GPUs) Architecture

Cited by: 0
Authors
Kasmi, Najlae [1]
Zbakh, Mostapha [1]
Haouari, Amine [1]
Affiliations
[1] Mohammed V Univ, ENSIAS, Rabat, Morocco
Source
CLOUD COMPUTING AND BIG DATA: TECHNOLOGIES, APPLICATIONS AND SECURITY | 2019 / Vol. 49
Keywords
DOI
10.1007/978-3-319-97719-5_20
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Subject classification code
081202 ;
Abstract
Solving systems of linear equations is one of the most CPU-intensive steps in engineering and simulation applications, and it can benefit greatly from the many processing cores and the vectorisation available on today's parallel computers. Our objective is to evaluate the performance of one such solver, the conjugate gradient method, on a hybrid computing platform (Multi-GPU/Multi-CPU). We consider the preconditioned conjugate gradient (PCG) solver because it exhibits the main features of such problems. Indeed, the relative performance of CPUs and GPUs depends strongly on the subroutine: GPUs, for instance, process regular kernels such as matrix-vector multiplication much more efficiently than more irregular kernels such as matrix factorization. In this context, one solution is to rely on dynamic scheduling and resource allocation mechanisms such as those provided by StarPU. In this chapter we evaluate the performance of the dynamic schedulers offered by StarPU and analyse the scalability of the PCG algorithm. We show how effectively the best combination of resources can be chosen to improve performance.
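For orientation, the PCG iteration named in the abstract can be sketched as follows. This is a minimal illustration in plain Python, assuming a symmetric positive-definite matrix and a Jacobi (diagonal) preconditioner; the abstract does not say which preconditioner the authors use, and the names `matvec`, `dot`, and `pcg` are hypothetical. The sketch does not use StarPU or any GPU.

```python
def matvec(A, x):
    """Dense matrix-vector product; the 'regular kernel' in each iteration."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A.

    Assumption: a Jacobi preconditioner M = diag(A) is used purely for
    illustration; it is not taken from the chapter.
    Returns the approximate solution and the number of iterations run.
    """
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]   # M^{-1} for Jacobi
    x = [0.0] * n                                  # initial guess x0 = 0
    r = list(b)                                    # residual r0 = b - A x0 = b
    z = [d * ri for d, ri in zip(inv_diag, r)]     # preconditioned residual
    p = list(z)                                    # first search direction
    rz = dot(r, z)
    for k in range(max_iter):
        if dot(r, r) ** 0.5 < tol:                 # stop on small residual norm
            return x, k
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)                    # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz                         # direction update coefficient
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    solution, iterations = pcg(A, b)
    print(solution, iterations)
```

Each iteration consists of one matrix-vector product plus a handful of vector updates and inner products, which is the kind of task decomposition a runtime scheduler could distribute across CPUs and GPUs.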
Pages: 318-336
Page count: 19