Multi-level parallelism for incompressible flow computations on GPU clusters

Cited by: 56
Authors
Jacobsen, Dana A. [1]
Senocak, Inanc [2]
Affiliations
[1] Boise State Univ, Dept Comp Sci, Boise, ID 83725 USA
[2] Boise State Univ, Dept Mech & Biomed Engn, Boise, ID 83725 USA
Funding
National Science Foundation (USA)
Keywords
GPU; Hybrid MPI-OpenMP-CUDA; Fluid dynamics; MPI; Performance
DOI
10.1016/j.parco.2012.10.002
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
We investigate multi-level parallelism on GPU clusters with MPI-CUDA and hybrid MPI-OpenMP-CUDA parallel implementations, in which all computations are done on the GPU using CUDA. We explore the efficiency and scalability of incompressible flow computations using up to 256 GPUs on a problem with approximately 17.2 billion cells. Our work addresses some of the unique issues faced when merging fine-grain parallelism on the GPU using CUDA with coarse-grain parallelism that uses either MPI or MPI-OpenMP for communications. We present three different strategies to overlap computations with communications, and systematically assess their impact on parallel performance on two different GPU clusters. Our results for strong and weak scaling analysis of incompressible flow computations demonstrate that GPU clusters offer significant benefits for large data sets, and that a dual-level MPI-CUDA implementation with maximum overlapping of computation and communication provides substantial benefits in performance. We also find that our tri-level MPI-OpenMP-CUDA parallel implementation does not offer a significant performance advantage over the dual-level implementation on GPU clusters with two GPUs per node. On clusters with higher GPU counts per node, or with different domain decomposition strategies, a tri-level implementation may exhibit higher efficiency than a dual-level implementation; this needs to be investigated further. © 2012 Elsevier B.V. All rights reserved.
Pages: 1-20
Number of pages: 20