Parallelization of 2D MPDATA EULAG algorithm on hybrid architectures with GPU accelerators

Cited by: 22
Authors
Wyrzykowski, Roman [1]
Szustak, Lukasz [1]
Rojek, Krzysztof [1]
Affiliations
[1] Czestochowa Tech Univ, Inst Comp & Informat Sci, Czestochowa, Poland
Keywords
MPDATA advection algorithm; Stencil computation; GPU accelerators; Hybrid CPU-GPU architectures; Hierarchical decomposition; Autotuning; ADVECTION TRANSPORT ALGORITHM; PERFORMANCE; MULTI; IMPLEMENTATION; SIMULATION;
DOI
10.1016/j.parco.2014.04.009
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
EULAG (Eulerian/semi-Lagrangian fluid solver) is an established computational model developed for simulating thermo-fluid flows across a wide range of scales and physical scenarios. The dynamic core of EULAG includes the multidimensional positive definite advection transport algorithm (MPDATA) and an elliptic solver. In this work we investigate aspects of an optimal parallel version of the 2D MPDATA algorithm on modern hybrid architectures with GPU accelerators, where computations are distributed across both GPU and CPU components. Using the hybrid OpenMP-OpenCL model of parallel programming opens the way to harness the power of CPU-GPU platforms in a portable way. To better utilize the features of such computing platforms, comprehensive adaptations of MPDATA computations to hybrid architectures are proposed. These adaptations are based on efficient strategies for memory and computing resource management, which allow us to ease memory and communication bounds and better exploit the theoretical floating-point efficiency of CPU-GPU platforms. The main contributions of the paper are: (i) a method for the decomposition of the 2D MPDATA algorithm as a tool to adapt MPDATA computations to hybrid architectures with GPU accelerators by minimizing communication and synchronization between CPU and GPU components at the cost of additional computations; (ii) a method for the adaptation of 2D MPDATA computations to multicore CPU platforms, based on space and temporal blocking techniques; (iii) a method for the adaptation of the 2D MPDATA algorithm to GPU architectures, based on a hierarchical decomposition strategy across data and computation domains, with support provided by the developed GPU task scheduler, which allows for the flexible management of available resources; (iv) an approach to the parametric optimization of 2D MPDATA computations on GPUs using the autotuning technique, which allows us to provide a portable implementation methodology across a variety of GPUs.
The hybrid platforms tested in this study contain different numbers of CPUs and GPUs, ranging from configurations with a single CPU and a single GPU to the most elaborate configuration with two CPUs and two GPUs. Processors from different vendors are employed in these systems: both Intel and AMD CPUs, as well as GPUs from NVIDIA and AMD. For all the grid sizes and all the tested platforms, the hybrid version, with computations spread across CPU and GPU components, achieves the highest performance. In particular, for the largest MPDATA grids used in our experiments, the speedups of the hybrid versions over the GPU and CPU versions range from 1.30 to 1.69 and from 1.95 to 2.25, respectively. © 2014 Elsevier B.V. All rights reserved.
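The trade-off named in the abstract, less communication and synchronization at the cost of additional computations, can be illustrated with a toy example. The sketch below is not MPDATA and not the authors' decomposition method; it assumes a simple 1D three-point averaging stencil, and all function names (`step`, `run`, `run_decomposed`) are hypothetical. Each of two subdomains is extended by a halo as wide as the number of time steps, so both can advance independently without exchanging boundary data, and the redundantly computed halo cells are discarded at the end.

```python
def step(u):
    """One sweep of a 3-point averaging stencil; the two end cells are held fixed."""
    new = list(u)
    for i in range(1, len(u) - 1):
        new[i] = (u[i - 1] + u[i] + u[i + 1]) / 3.0
    return new


def run(u, steps):
    """Apply `steps` sweeps to the whole array (reference, undecomposed version)."""
    for _ in range(steps):
        u = step(u)
    return u


def run_decomposed(u, steps):
    """Split the domain in two and advance each half without any halo exchange.

    Each half carries `steps` extra cells from its neighbour. Errors from the
    artificial cut travel one cell per sweep, so after `steps` sweeps they have
    spoiled only the extra cells, which are thrown away.
    """
    n = len(u)
    mid = n // 2
    assert steps <= mid and mid + steps <= n, "halo must fit inside the domain"
    left = run(u[:mid + steps], steps)    # owns cells [0, mid)
    right = run(u[mid - steps:], steps)   # owns cells [mid, n)
    return left[:mid] + right[steps:]


# Hypothetical input data, chosen only to make the array non-trivial.
u0 = [float((7 * i) % 11) for i in range(32)]
reference = run(u0, 4)
decomposed = run_decomposed(u0, 4)
```

In this toy setting the two halves could be assigned to different devices and synchronized only once per block of sweeps; the price is that each half recomputes the halo cells of the other.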
Pages: 425-447
Number of pages: 23
Related papers
50 in total
  • [21] Photoconduction in 2D Single-Crystal Hybrid Perovskites
    Demontis, Valeria
    Durante, Ofelia
    Marongiu, Daniela
    De Stefano, Sebastiano
    Matta, Selene
    Simbula, Angelica
    Capello, Carlotta Ragazzo
    Pennelli, Giovanni
    Quochi, Francesco
    Saba, Michele
    Di Bartolomeo, Antonio
    Mura, Andrea
    Bongiovanni, Giovanni
    ADVANCED OPTICAL MATERIALS, 2025, 13 (06):
  • [22] Emerging Issues and Opportunities of 2D Layered Transition Metal Dichalcogenide Architectures for Supercapacitors
    Liu, Shude
    Zhang, Huilin
    Peng, Xue
    Chen, Jieming
    Kang, Ling
    Yin, Xia
    Yusuke, Yamauchi
    Ding, Bin
    ACS NANO, 2025, : 13591 - 13636
  • [23] GPU-accelerated approach for 2D fracture analysis of structures combining finite particle method and cohesive zone model
    Kang, Yufeng
    Zheng, Yanfeng
    Li, Siyuan
    Zhang, Jingyao
    Tang, Jingzhe
    Yang, Chao
    Luo, Yaozhi
    ENGINEERING FRACTURE MECHANICS, 2024, 306
  • [24] 2D Efficient Unconditionally Stable Meshless FDTD Algorithm
    Luo, Kang
    Yi, Yun
    Duan, Yantao
    Xu, Boao
    Chen, Bin
    INTERNATIONAL JOURNAL OF ANTENNAS AND PROPAGATION, 2016, 2016
  • [25] Discretization of 2D random fields: A genetic algorithm approach
    Allaix, Diego Lorenzo
    Carbone, Vincenzo Ilario
    ENGINEERING STRUCTURES, 2009, 31 (05) : 1111 - 1119
  • [26] 2D and 3D A* Algorithm Comparison for UAS Traffic Management Systems
    Potter Neto, Carlos Augusto
    Bertoli, Gustavo de Carvalho
    Saotome, Osamu
    2020 INTERNATIONAL CONFERENCE ON UNMANNED AIRCRAFT SYSTEMS (ICUAS'20), 2020, : 72 - 76
  • [27] An Algorithm to Generate Synthetic 3D Microstructures from 2D Exemplars
    Ashton, Tristan N.
    Guillen, Donna Post
    Harris, William H.
    JOM, 2020, 72 (01) : 65 - 74
  • [28] Evaluation of a hybrid-log 2D wavelet image transform
    Wisdom, Michael
    Lee, Peter
    PROCEEDINGS OF THE 2007 15TH INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING, 2007, : 411 - +
  • [29] Hybrid Dion-Jacobson 2D Lead Iodide Perovskites
    Mao, Lingling
    Ke, Weijun
    Pedesseau, Laurent
    Wu, Yilei
    Katan, Claudine
    Even, Jacky
    Wasielewski, Michael R.
    Stoumpos, Constantinos C.
    Kanatzidis, Mercouri G.
    JOURNAL OF THE AMERICAN CHEMICAL SOCIETY, 2018, 140 (10) : 3775 - 3783
  • [30] Conducting and superhydrophobic hybrid 2D material from coronene and pyrene
    Arya, Jyothibabu Sajila
    Mahato, Malay Krishna
    Sankararaman, Sethuraman
    Prasad, Edamana
    JOURNAL OF MATERIALS CHEMISTRY C, 2021, 9 (32) : 10324 - 10333