Experiences Using CPUs and GPUs for Cooperative Computation in a Multi-Physics Simulation

Cited by: 1
Authors
Pearce, Olga [1]
Affiliations
[1] Lawrence Livermore Natl Lab, Livermore, CA 94550 USA
Source
47TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING (ICPP '18) | 2018
Keywords
GPU; multi-physics; simulation; load balancing
DOI
10.1145/3229710.3229711
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Top supercomputers in the TOP500 list have transitioned from homogeneous node architectures toward heterogeneous manycore nodes with accelerators and CPUs. These new architectures present significant challenges to developers of large-scale multiphysics applications, especially at DOE laboratories that have invested heavily in scalable MPI codes over decades. Many of the scientific application porting efforts for the new heterogeneous architectures focus on running the computation on the accelerators, which usually comprise >90% of the FLOPS of the system. We describe an approach to utilizing the remaining FLOPS on a heterogeneous machine by running a portion of the computation on the CPUs cooperatively with the GPU computation. We present a proof-of-concept implementation in ARES, a multiphysics ALE-AMR code at LLNL. ARES uses a portability layer, RAJA, which enables us to utilize the same source code for both the CPU and the GPU. We develop an approach to utilize both types of processors cooperatively in a mixed-processor system. Our implementation divides the work between the computing resources via domain decomposition, and utilizes all cores of the CPU and all of the GPUs on the node for computation. Load balancing is necessary to use the heterogeneous resources effectively. We present preliminary results on early delivery pre-Sierra machines at LLNL, showing up to an 18% performance benefit of using the CPUs on the heterogeneous nodes for computing in addition to using the GPUs.
Pages: 10