Early experience on porting and running a Lattice Boltzmann code on the Xeon-Phi co-processor

Cited by: 36
Authors
Crimi, G. [1]
Mantovani, F. [2]
Pivanti, M. [5, 6]
Schifano, S. F. [4, 7]
Tripiccione, R. [3, 4]
Affiliations
[1] Univ Ferrara, I-44100 Ferrara, Italy
[2] Univ Regensburg, Fac Phys, Regensburg, Germany
[3] Univ Ferrara, Dipt Fis & Sci della Terra & CMCS, Ferrara, Italy
[4] Univ Ferrara, INFN, Ferrara, Italy
[5] Univ Roma La Sapienza, Dipartimento Fis, Rome, Italy
[6] Univ Roma La Sapienza, INFN, Rome, Italy
[7] Univ Ferrara, Dipt Matemat & Informat, Ferrara, Italy
Source
2013 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE | 2013 / Vol. 18
Keywords
Lattice Boltzmann; Many-core systems; Performance optimization
DOI
10.1016/j.procs.2013.05.219
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
In this paper we report on our early experience in porting, optimizing and benchmarking a Lattice Boltzmann (LB) code on the Xeon-Phi co-processor, the first generally available version of the new Many Integrated Core (MIC) architecture developed by Intel. As a test-bed we consider a state-of-the-art LB model that accurately reproduces the thermo-hydrodynamics of a 2D fluid obeying the equation of state of a perfect gas. The regular structure of LB algorithms makes it relatively easy to identify a large degree of available parallelism; however, mapping a large fraction of this parallelism onto this new class of processors is not straightforward. The D2Q37 LB algorithm considered in this paper is an appropriate test-bed for this architecture, since its critical computing kernels require high performance both in memory bandwidth for sparse memory access patterns and in number-crunching capability. We describe our implementation of the code, which builds on previous experience with other (simpler) many-core processors and GPUs, present benchmark results and performance measurements, and finally compare them with the results obtained by previous implementations developed for state-of-the-art classic multi-core CPUs and GP-GPUs.
Pages: 551-560
Number of pages: 10