High-performance Shallow Water Model for Use on Massively Parallel and Heterogeneous Computing Systems

Authors
Chaplygin A.V. [1]
Gusev A.V. [1, 2, 3]
Diansky N.A. [1, 3, 4]
Affiliations
[1] Marchuk Institute of Numerical Mathematics of the Russian Academy of Sciences, Moscow
[2] P.P. Shirshov Institute of Oceanology of the Russian Academy of Sciences, Moscow
[3] N.N. Zubov State Oceanographic Institute, Moscow
[4] Lomonosov Moscow State University, Moscow
Funding
Russian Foundation for Basic Research
Keywords
CUDA; Heterogeneous computing systems; MPI; OpenMP; Shallow water; Supercomputer modeling
DOI
10.14529/JSFI210407
Abstract
This paper presents a shallow water model derived from the ocean general circulation sigma model INMOM (Institute of Numerical Mathematics Ocean Model). The shallow water model is built on a software architecture that separates the physics-related code from the features of the parallel implementation, which simplifies the model's support and development. As an improvement on the two-dimensional domain decomposition method, we present a block-based decomposition that provides load-balanced and cache-friendly calculations on CPUs. We propose several hybrid parallel programming patterns in the shallow water model for effective calculation on massively parallel and heterogeneous computing systems and evaluate their scaling performance on the Lomonosov-2 supercomputer. We demonstrate that performance per grid point on GPUs decreases dramatically for small grid sizes, starting from 2^19 points per node, while performance on CPUs scales well to 2^17 points per node. Even so, calculations on GPUs outperform calculations on CPUs by a factor of 4.7 on 30 nodes, using 60 GPUs and 360 CPU cores, at a grid size of 6100 × 4460. We demonstrate that overlapping kernel execution with data transfers on GPUs increases performance by 28%. Furthermore, we demonstrate the advantage of using the load-balancing method in the Azov Sea model on CPUs and GPUs. © The Authors 2021. This paper is published with open access at SuperFri.org
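The block-based, load-balanced decomposition mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe the balancing algorithm itself, so everything below is an assumption made for illustration: the function names `split_into_blocks` and `assign_blocks` are hypothetical, the work of a block is taken to be its number of wet (sea) points, and a simple greedy heuristic stands in for whatever method the paper actually uses.

```python
import heapq


def split_into_blocks(mask, block_rows, block_cols):
    """Cut a 2-D wet/dry mask (1 = sea, 0 = land) into rectangular blocks.

    Returns a list of (row0, col0, wet_count) tuples. Blocks that contain
    only land are dropped because they carry no work (an assumption of
    this sketch, not a statement about the paper's code).
    """
    n_rows, n_cols = len(mask), len(mask[0])
    blocks = []
    for r0 in range(0, n_rows, block_rows):
        for c0 in range(0, n_cols, block_cols):
            wet = sum(
                mask[r][c]
                for r in range(r0, min(r0 + block_rows, n_rows))
                for c in range(c0, min(c0 + block_cols, n_cols))
            )
            if wet > 0:
                blocks.append((r0, c0, wet))
    return blocks


def assign_blocks(blocks, n_ranks):
    """Greedy balancing: heaviest block first, always to the least-loaded rank.

    Returns (owner, loads): owner maps rank -> list of (row0, col0),
    loads maps rank -> total number of wet points assigned to it.
    """
    heap = [(0, rank) for rank in range(n_ranks)]
    heapq.heapify(heap)
    owner = {rank: [] for rank in range(n_ranks)}
    loads = {rank: 0 for rank in range(n_ranks)}
    for r0, c0, wet in sorted(blocks, key=lambda b: -b[2]):
        _, rank = heapq.heappop(heap)
        owner[rank].append((r0, c0))
        loads[rank] += wet
        heapq.heappush(heap, (loads[rank], rank))
    return owner, loads


# Toy 8 x 8 basin whose upper-left 4 x 4 corner is land.
mask = [[0 if (r < 4 and c < 4) else 1 for c in range(8)] for r in range(8)]
blocks = split_into_blocks(mask, block_rows=2, block_cols=2)
owner, loads = assign_blocks(blocks, n_ranks=3)
print(len(blocks), loads)
```

A plain two-dimensional decomposition of the same toy basin into four equal 4 × 4 subdomains would leave one of them with no sea points at all, which is the kind of imbalance that smaller blocks make it possible to avoid.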
Pages: 74-93
Number of pages: 19