Efficient parallel implementation of the lattice Boltzmann method on large clusters of graphic processing units

Cited by: 0

Authors:

XIONG QinGang [1]

Affiliations:

[2] Graduate University of Chinese Academy of Sciences

Source:

Science Bulletin | 2012, Issue 07

Funding:

National Natural Science Foundation of China;

Keywords:

asynchronous execution; compute unified device architecture; graphic processing unit; lattice Boltzmann method; non-blocking message passing interface; OpenMP;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP391.41;

Discipline classification code:

080203;

Abstract:

Many-core processors, such as graphic processing units (GPUs), are promising platforms for intrinsically parallel algorithms such as the lattice Boltzmann method (LBM). Although tremendous speedup has been obtained on a single GPU compared with mainstream CPUs, the performance of the LBM on multiple GPUs has not been studied extensively and systematically. In this article, we carry out LBM simulation on a GPU cluster with many nodes, each having multiple Fermi GPUs. Asynchronous execution with CUDA stream functions, OpenMP and non-blocking MPI communication are incorporated to improve efficiency. The algorithm is tested for two-dimensional Couette flow and the results are in good agreement with the analytical solution. For both the one- and two-dimensional decomposition of space, the algorithm performs well as most of the communication time is hidden. Direct numerical simulation of a two-dimensional gas-solid suspension containing more than one million solid particles and one billion gas lattice cells demonstrates the potential of this algorithm in large-scale engineering applications. The algorithm can be directly extended to the three-dimensional decomposition of space and other modeling methods including explicit grid-based methods.
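
As a rough illustration of the spatial decomposition mentioned in the abstract, the following Python sketch splits a periodic row of lattice cells into subdomains and performs one streaming step with a halo exchange. It is not the authors' CUDA/OpenMP/MPI code: the function names `stream_global` and `stream_decomposed`, the single right-moving population, and the serial list standing in for messages are all assumptions chosen for brevity. What it is meant to show is that only the first cell of each subdomain depends on a neighbour's data, which is why interior updates can in principle proceed while boundary messages are still in flight.

```python
def stream_global(f):
    """Reference: shift a right-moving population one cell, periodic wrap."""
    return [f[-1]] + f[:-1]


def stream_decomposed(f, parts):
    """Same streaming step on `parts` equal subdomains (hypothetical helper).

    Each subdomain needs exactly one halo value: the last cell of its
    left neighbour. All other cells are updated from local data only.
    """
    if parts <= 0 or len(f) % parts != 0:
        raise ValueError("cell count must be divisible by the number of parts")
    size = len(f) // parts
    chunks = [f[p * size:(p + 1) * size] for p in range(parts)]

    # "Communication": the halo each subdomain receives from its left
    # neighbour (a stand-in for a non-blocking send/receive pair).
    halos = [chunks[(p - 1) % parts][-1] for p in range(parts)]

    # "Computation": interior cells shift locally; the halo value fills
    # the first cell once it has arrived.
    result = []
    for p in range(parts):
        result.extend([halos[p]] + chunks[p][:-1])
    return result


if __name__ == "__main__":
    cells = list(range(12))
    # The decomposed update must reproduce the global one.
    assert stream_decomposed(cells, 3) == stream_global(cells)
```

In a real multi-GPU code the halo transfer and the interior update would run concurrently; here they are written one after the other only to keep the sketch serial and checkable.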

Pages: 707-715

Page count: 9
