High-performance Shallow Water Model for Use on Massively Parallel and Heterogeneous Computing Systems

Authors
Chaplygin A.V. [1]
Gusev A.V. [1, 2, 3]
Diansky N.A. [1, 3, 4]
Affiliations
[1] Marchuk Institute of Numerical Mathematics of the Russian Academy of Sciences, Moscow
[2] P.P. Shirshov Institute of Oceanology of the Russian Academy of Sciences, Moscow
[3] N.N. Zubov State Oceanographic Institute, Moscow
[4] Lomonosov Moscow State University, Moscow
Funding
Russian Foundation for Basic Research
Keywords
CUDA; Heterogeneous computing systems; MPI; OpenMP; Shallow water; Supercomputer modeling
DOI
10.14529/JSFI210407
Abstract
This paper presents a shallow water model derived from the ocean general circulation sigma model INMOM (Institute of Numerical Mathematics Ocean Model). The shallow water model is built on a software architecture that separates the physics-related code from the details of the parallel implementation, thereby simplifying the model’s support and development. As an improvement on the two-dimensional domain decomposition method, we present a block-based decomposition that provides load-balanced and cache-friendly calculations on CPUs. We propose several hybrid parallel programming patterns in the shallow water model for effective calculation on massively parallel and heterogeneous computing systems and evaluate their scaling performance on the Lomonosov-2 supercomputer. We demonstrate that performance per grid point on GPUs decreases dramatically for small grid sizes, starting from 2^19 points per node, while performance on CPUs scales well down to 2^17 points per node. Nevertheless, calculations on GPUs outperform calculations on CPUs by a factor of 4.7 at 30 nodes, using 60 GPUs and 360 CPU cores, at a grid size of 6100 × 4460. We demonstrate that overlapping kernel execution with data transfers on GPUs increases performance by 28%. Furthermore, we demonstrate the advantage of using the load-balancing method in the Azov Sea model on CPUs and GPUs. © The Authors 2021. This paper is published with open access at SuperFri.org
Pages: 74-93
Page count: 19