A massively parallel GPU-accelerated model for analysis of fully nonlinear free surface waves

Cited by: 44
Authors
Engsig-Karup, A. P. [1]
Madsen, Morten G. [1]
Glimberg, Stefan L. [1]
Affiliations
[1] Tech Univ Denmark, Dept Informat & Math Modeling, DK-2800 Lyngby, Denmark
Keywords
nonlinear water waves; coastal and offshore engineering; finite difference method; potential flow; time domain; scientific GPU computations; high-performance computing; WATER-WAVES;
DOI
10.1002/fld.2675
Chinese Library Classification
TP39 [Computer applications];
Discipline classification codes
081203; 0835;
Abstract
We implement and evaluate a massively parallel and scalable algorithm based on a multigrid-preconditioned defect correction method for the simulation of fully nonlinear free surface flows. The simulations are based on a potential flow model that describes wave propagation over uneven bottoms in three space dimensions and is useful for fast analysis and prediction in coastal and offshore engineering. A dedicated numerical model based on the proposed algorithm is executed in parallel on an affordable, modern, special-purpose graphics processing unit (GPU). The model is based on a low-storage, flexible-order accurate finite difference method that is known to be efficient and scalable on a CPU core (single thread). To achieve parallel performance of the relatively complex numerical model, we investigate a new trend in high-performance computing where many-core GPUs are utilized as high-throughput co-processors to the CPU. We describe and demonstrate how this approach makes fast desktop computations possible for large nonlinear wave problems in numerical wave tanks (NWTs) with close to 50/100 million total grid points in double/single precision when 4 GB of global device memory is available. A new code base has been developed in C++ and Compute Unified Device Architecture (CUDA) C. When executed on a single modern GPU, it improves the runtime by more than an order of magnitude in double precision arithmetic, for the same accuracy, over an existing CPU (single thread) Fortran 90 code. These significant improvements are achieved by carefully implementing the algorithm to minimize data transfer and to take advantage of the massive multi-threading capability of the GPU device. Copyright (c) 2011 John Wiley & Sons, Ltd.
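The defect correction idea named in the abstract can be illustrated on a much smaller problem. The sketch below is not the authors' code: it assumes a 1D model problem, -u'' = f on (0, 1) with zero boundary values, instead of the 3D Laplace problem; it runs on the CPU in plain C++ rather than CUDA C; and it uses a direct tridiagonal solve of the second-order discretization as a stand-in for the multigrid preconditioner. The function names `apply_high_order`, `solve_low_order`, and `defect_correction`, and the odd-reflection treatment of ghost points, are hypothetical choices made for illustration. Each iteration evaluates the residual with the fourth-order operator and corrects the solution by solving with the cheaper low-order operator.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Hypothetical helper: y = A u, where A is the fourth-order central difference
// approximation of -d^2/dx^2 on n interior points with spacing h.
// Assumption for this sketch: boundary values are zero and the ghost value
// one point beyond each wall is obtained by odd reflection.
Vec apply_high_order(const Vec& u, double h) {
    const int n = static_cast<int>(u.size());
    auto at = [&](int j) {
        if (j == -1 || j == n) return 0.0;   // Dirichlet boundary value
        if (j == -2) return -u[0];           // odd reflection at the left wall
        if (j == n + 1) return -u[n - 1];    // odd reflection at the right wall
        return u[j];
    };
    Vec y(n);
    const double s = 1.0 / (12.0 * h * h);
    for (int j = 0; j < n; ++j) {
        y[j] = s * (at(j - 2) - 16.0 * at(j - 1) + 30.0 * at(j)
                    - 16.0 * at(j + 1) + at(j + 2));
    }
    return y;
}

// Hypothetical helper: solve M x = r, where M is the second-order
// discretization (1/h^2) * tridiag(-1, 2, -1), using the Thomas algorithm.
// In the paper this role is played by a multigrid preconditioner.
Vec solve_low_order(const Vec& r, double h) {
    const int n = static_cast<int>(r.size());
    const double a = -1.0 / (h * h);  // off-diagonal entry
    const double b = 2.0 / (h * h);   // diagonal entry
    Vec c(n), d(n), x(n);
    c[0] = a / b;
    d[0] = r[0] / b;
    for (int i = 1; i < n; ++i) {     // forward elimination
        const double m = b - a * c[i - 1];
        c[i] = a / m;
        d[i] = (r[i] - a * d[i - 1]) / m;
    }
    x[n - 1] = d[n - 1];
    for (int i = n - 2; i >= 0; --i) {  // back substitution
        x[i] = d[i] - c[i] * x[i + 1];
    }
    return x;
}

// Defect correction: u <- u + M^{-1} (f - A u), repeated until the
// maximum-norm residual of the high-order system drops below tol.
Vec defect_correction(const Vec& f, double h, int max_iter, double tol) {
    Vec u(f.size(), 0.0);
    for (int k = 0; k < max_iter; ++k) {
        const Vec Au = apply_high_order(u, h);
        Vec r(f.size());
        double rmax = 0.0;
        for (std::size_t i = 0; i < f.size(); ++i) {
            r[i] = f[i] - Au[i];
            rmax = std::max(rmax, std::fabs(r[i]));
        }
        if (rmax < tol) break;
        const Vec du = solve_low_order(r, h);
        for (std::size_t i = 0; i < u.size(); ++i) u[i] += du[i];
    }
    return u;
}
```

For a smooth right-hand side such as f(x) = π² sin(πx), calling `defect_correction(f, h, 50, 1e-8)` should return a solution with fourth-order accuracy, while only ever inverting the second-order operator. That split is what makes the approach attractive on a GPU: the expensive part is the repeated application of a stencil, which parallelizes well.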
Pages: 20-36
Page count: 17