A parallel MPI + OpenMP + OpenCL algorithm for hybrid supercomputations of incompressible flows

Cited by: 22

Authors:

Gorobets, A. V. [1,2]

Trias, F. X. [1]

Oliva, A. [1]

Affiliations:

[1] Tech Univ Catalonia, ETSEIAT, Heat & Mass Transfer Technol Ctr, Terrassa 08222, Spain

[2] Keldysh Inst Appl Math, Moscow 125047, Russia

Source:

COMPUTERS & FLUIDS | 2013 / Vol. 88

Keywords:

MPI; OpenMP; OpenCL; GPU; Parallel CFD; Turbulence; SCHUR-FOURIER DECOMPOSITION; COMPUTERS; SOLVER

DOI:

10.1016/j.compfluid.2013.05.021

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Subject classification code:

081203 ; 0835 ;

Abstract:

The work is devoted to the development of efficient parallel algorithms for large-scale simulations of incompressible flows on hybrid supercomputers based on massively-parallel accelerators. The governing equations are discretized using a high-order finite-volume scheme for Cartesian staggered meshes, with the only restriction that at least one direction is periodic. Its "classical" MPI + OpenMP parallel implementation for CPUs was designed to scale up to 100,000 CPU cores. The new hybrid algorithm is developed on the basis of a multi-level parallel model that exploits several layers of parallelism of a modern hybrid supercomputer. In this model, MPI and OpenMP are used on the first two levels to couple nodes of a supercomputer and to engage its CPU cores. Then, computing accelerators are further used by means of the hardware-independent OpenCL computing standard. In this way, the implementation is adapted to a general computing model with central processors and math co-processors. In this paper the work is focused on adapting the basic operations of the algorithm to architectures of Graphics Processing Units (GPU) without considering the multi-CPU communication scheme. The technology for porting the code to OpenCL is described, certain optimization approaches are presented, and relevant performance results, reaching up to 80-90 GFLOPS on a GPU accelerator, are demonstrated. Moreover, the experience with different CPU architectures is summarized and a comparison based on the particular application is given for AMD and NVIDIA GPUs as well as for CUDA and OpenCL frameworks. © 2013 Elsevier Ltd. All rights reserved.

Pages: 764-772

Number of pages: 9

50 items in total

[21] Parallelization of Reverse Time Migration Using MPI plus OpenMP
Akanksha, Kansara S.
Kumar, Gardas Naresh
PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON ADVANCED COMMUNICATION CONTROL AND COMPUTING TECHNOLOGIES (ICACCCT), 2016, : 695 - 697
[22] Hybrid MPI plus UPC parallel programming paradigm on an SMP cluster
Bozkus, Zeki
TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2012, 20 : 1389 - 1407
[23] A novel intuitionistic-near fuzzy sets based image fusion approach: development on hybrid MPI plus OpenMP parallel model
Biswas, Biswajit
Ghosh, Swarup Kr
Ghosh, Anupam
MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (21) : 29699 - 29730
[24] Collectives in hybrid MPI plus MPI code: Design, practice and performance
Zhou, Huan
Gracia, Jose
Zhou, Naweiluo
Schneider, Ralf
PARALLEL COMPUTING, 2020, 99
[25] MPI Collectives for Multi-core Clusters: Optimized Performance of the Hybrid MPI plus MPI Parallel Codes
Zhou, Huan
Gracia, Jose
Schneider, Ralf
PROCEEDINGS OF THE 48TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING WORKSHOPS (ICPP 2019), 2019,
[26] Parallization of Adaboost Algorithm Through Hybrid MPI/OpenMP and Transactional Memory
Zeng, Kun
Tang, Yuhua
Liu, Fudong
PROCEEDINGS OF THE 19TH INTERNATIONAL EUROMICRO CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING, 2011, : 94 - 100
[27] Hybrid MPI/OpenMP parallel asynchronous distributed alternating direction method of multipliers
Dongxia Wang
Yongmei Lei
Jianhui Zhou
Computing, 2021, 103 : 2737 - 2762
[28] MPI plus OpenMP Tasking Scalability for the Simulation of the Human Brain: Human Brain Project
Valero-Lara, Pedro
Sirvent, Raul
Pena, Antonio J.
Martorell, Xavier
Labarta, Jesus
EUROMPI 2018: PROCEEDINGS OF THE 25TH EUROPEAN MPI USERS' GROUP MEETING, 2018,
[29] Hybrid MPI/OpenMP parallel asynchronous distributed alternating direction method of multipliers
Wang, Dongxia
Lei, Yongmei
Zhou, Jianhui
COMPUTING, 2021, 103 (12) : 2737 - 2762
[30] MPI plus OpenMP tasking scalability for multi-morphology simulations of the human brain
Valero-Lara, Pedro
Sirvent, Raul
Pena, Antonio J.
Labarta, Jesus
PARALLEL COMPUTING, 2019, 84 : 50 - 61