A comparison of the shared-memory parallel programming models OpenMP, OpenACC and Kokkos in the context of implicit solvers for high-order FEM

Cited by: 16
Authors
Eichstadt, Jan [1]
Vymazal, Martin [1]
Moxey, David [2]
Peiro, Joaquim [1]
Affiliations
[1] Imperial Coll London, Dept Aeronaut, London, England
[2] Univ Exeter, Coll Engn Math & Phys Sci, Exeter, Devon, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC); EU Horizon 2020;
Keywords
Shared-memory parallel programming models; OpenMP; OpenACC; Kokkos; Helmholtz equation; FEM; PERFORMANCE; BENCHMARK; FRAMEWORK;
DOI
10.1016/j.cpc.2020.107245
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
We consider the application of three performance-portable programming models in the context of a high-order spectral element, implicit time-stepping solver for the Navier-Stokes equations. We aim to evaluate whether the use of these models allows code developers to deliver high-performance solvers for computational fluid dynamics simulations that are capable of effectively utilising both many-core CPU and GPU architectures. Using the core elliptic solver for the Navier-Stokes equations as a benchmarking guide, we evaluate the performance of these models on a range of unstructured meshes and give guidelines for the translation of existing codebases and their data structures to these models. (C) 2020 Elsevier B.V. All rights reserved.
Pages: 15
Related papers
25 in total
[1]   The deal.II library, Version 9.0 [J].
Alzetta, Giovanni ;
Arndt, Daniel ;
Bangerth, Wolfgang ;
Boddu, Vishal ;
Brands, Benjamin ;
Davydov, Denis ;
Gassmoller, Rene ;
Heister, Timo ;
Heltai, Luca ;
Kormann, Katharina ;
Kronbichler, Martin ;
Maier, Matthias ;
Pelteret, Jean-Paul ;
Turcksin, Bruno ;
Wells, David .
JOURNAL OF NUMERICAL MATHEMATICS, 2018, 26 (04) :173-183
[2]  
Bastian P, 2016, LECT NOTES COMP SCI, V113, P3, DOI 10.1007/978-3-319-40528-5_1
[3]   Nektar++: An open-source spectral/hp element framework [J].
Cantwell, C. D. ;
Moxey, D. ;
Comerford, A. ;
Bolis, A. ;
Rocco, G. ;
Mengaldo, G. ;
De Grazia, D. ;
Yakovlev, S. ;
Lombard, J. -E. ;
Ekelschot, D. ;
Jordi, B. ;
Xu, H. ;
Mohamied, Y. ;
Eskilsson, C. ;
Nelson, B. ;
Vos, P. ;
Biotto, C. ;
Kirby, R. M. ;
Sherwin, S. J. .
COMPUTER PHYSICS COMMUNICATIONS, 2015, 192 :205-219
[4]   From h to p efficiently: Strategy selection for operator evaluation on hexahedral and tetrahedral elements [J].
Cantwell, C. D. ;
Sherwin, S. J. ;
Kirby, R. M. ;
Kelly, P. H. J. .
COMPUTERS & FLUIDS, 2011, 43 (01) :23-28
[5]  
Che SA, 2009, I S WORKL CHAR PROC, P44, DOI 10.1109/IISWC.2009.5306797
[6]  
Danalis A., 2010, GPGPU 3, P63, DOI 10.1145/1735688.1735702
[7]  
Dong T., 2016, 082016 ICL U TENN, P37996
[8]   The LINPACK benchmark: past, present and future [J].
Dongarra, JJ ;
Luszczek, P ;
Petitet, A .
CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2003, 15 (09) :803-820
[9]   Kokkos: Enabling manycore performance portability through polymorphic memory access patterns [J].
Edwards, H. Carter ;
Trott, Christian R. ;
Sunderland, Daniel .
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2014, 74 (12) :3202-3216
[10]   Accelerating high-order mesh optimisation with an architecture-independent programming model [J].
Eichstadt, Jan ;
Green, Mashy ;
Turner, Michael ;
Peiro, Joaquim ;
Moxey, David .
COMPUTER PHYSICS COMMUNICATIONS, 2018, 229 :36-53