High-performance SIMD implementation of the lattice-Boltzmann method on the Xeon Phi processor

Cited by: 4
Authors
Robertsen, Fredrik [1 ,2 ]
Mattila, Keijo [3 ,4 ]
Westerholm, Jan [2 ]
Affiliations
[1] CSC IT Ctr Sci, POB 405, FI-02101 Espoo, Finland
[2] Abo Akad Univ, Fac Sci & Engn, Vattenborgsvagen 3, FI-20500 Turku, Finland
[3] Univ Jyvaskyla, Fac Informat Technol, Jyvaskyla, Finland
[4] Tampere Univ Technol, Dept Phys, Tampere, Finland
Source
Keywords
Lattice Boltzmann; prefetching; SIMD; Xeon Phi;
DOI
10.1002/cpe.5072
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
We present a high-performance implementation of the lattice-Boltzmann method (LBM) on the Knights Landing generation of Xeon Phi. The Knights Landing architecture includes 16 GB of high-speed memory (MCDRAM) with a reported bandwidth of over 400 GB/s, and a subset of the AVX-512 single instruction, multiple data (SIMD) instruction set. We explain five critical implementation aspects for high performance on this architecture: (1) the choice of an appropriate LBM algorithm, (2) a suitable data layout, (3) vectorization of the computation, (4) data prefetching, and (5) running our LBM simulations exclusively from the MCDRAM. The effects of these implementation aspects on the computational performance are demonstrated with the lattice-Boltzmann scheme involving the D3Q19 discrete velocity set and the TRT collision operator. In our benchmark simulations of fluid flow through porous media, using double-precision floating-point arithmetic, the observed performance exceeds 960 million fluid lattice site updates per second.
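To put the two figures quoted in the abstract side by side, the following is a rough back-of-the-envelope sketch, not the authors' analysis. It assumes (this is not stated in the abstract) that each fluid site update reads and writes all 19 double-precision distribution values exactly once, with no additional write-allocate traffic. The function names are hypothetical, and the actual memory traffic depends on the LBM algorithm and data layout chosen in the paper.

```python
# Rough memory-traffic estimate for a D3Q19 lattice-Boltzmann update.
# ASSUMPTION (not from the abstract): one read and one write per distribution
# value per fluid site update, and no extra write-allocate traffic.

Q = 19                       # discrete velocities in the D3Q19 set
BYTES_PER_VALUE = 8          # double-precision floating point
REPORTED_MFLUPS = 960        # million fluid lattice site updates/s (abstract)
MCDRAM_BANDWIDTH_GBS = 400   # reported MCDRAM bandwidth in GB/s (abstract: "over 400")


def bytes_per_update(q: int = Q, value_bytes: int = BYTES_PER_VALUE) -> int:
    """Bytes moved per site update under the read-once/write-once assumption."""
    return 2 * q * value_bytes


def implied_bandwidth_gbs(mflups: float) -> float:
    """Memory traffic in GB/s implied by a given update rate in MFLUPS."""
    return mflups * 1e6 * bytes_per_update() / 1e9


def bandwidth_bound_mflups(bandwidth_gbs: float) -> float:
    """Update rate in MFLUPS that would saturate a given bandwidth in GB/s."""
    return bandwidth_gbs * 1e9 / bytes_per_update() / 1e6


if __name__ == "__main__":
    print(f"bytes per update:         {bytes_per_update()}")
    print(f"traffic at 960 MFLUPS:    {implied_bandwidth_gbs(REPORTED_MFLUPS):.2f} GB/s")
    print(f"bound at 400 GB/s:        {bandwidth_bound_mflups(MCDRAM_BANDWIDTH_GBS):.0f} MFLUPS")
```

Under this assumption, 960 million updates per second corresponds to roughly 292 GB/s of memory traffic, which is consistent with the abstract's emphasis on running exclusively from MCDRAM. The estimate ignores solid sites in the porous medium and any index or neighbor data, so it should be read as an order-of-magnitude check only.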
Pages: 16
Related papers
50 in total
  • [1] Performance Evaluation of an OpenCL Implementation of the Lattice Boltzmann Method on the Intel Xeon Phi
    Obrecht, Christian
    Tourancheau, Bernard
    Kuznik, Frederic
    PARALLEL PROCESSING LETTERS, 2015, 25 (03)
  • [2] Accuracy of the lattice-Boltzmann method using the Cell processor
    Harvey, M. J.
    De Fabritiis, G.
    Giupponi, G.
    PHYSICAL REVIEW E, 2008, 78 (05)
  • [3] Early experience on porting and running a Lattice Boltzmann code on the Xeon-Phi co-processor
    Crimi, G.
    Mantovani, F.
    Pivanti, M.
    Schifano, S. F.
    Tripiccione, R.
    2013 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE, 2013, 18 : 551 - 560
  • [4] Accuracy of the lattice-Boltzmann method
    Maier, RS
    Bernard, RS
    INTERNATIONAL JOURNAL OF MODERN PHYSICS C, 1997, 8 (04) : 747 - 752
  • [5] VALIDATION OF AN ADAPTIVE MESHING IMPLEMENTATION OF THE LATTICE-BOLTZMANN METHOD FOR INSECT FLIGHT
    Feaster, Jeffrey
    Battaglia, Francine
    Deiterding, Ralf
    Bayandor, Javid
    PROCEEDINGS OF THE ASME FLUIDS ENGINEERING DIVISION SUMMER MEETING, 2016, VOL 1A, 2016
  • [6] A dynamic boundary model for implementation of boundary conditions in lattice-Boltzmann method
    Kang, Jinfen
    Kang, Sangmo
    Suh, Yong Kweon
    JOURNAL OF MECHANICAL SCIENCE AND TECHNOLOGY, 2008, 22 (06) : 1192 - 1201
  • [7] A dynamic boundary model for implementation of boundary conditions in lattice-Boltzmann method
    Jinfen Kang
    Sangmo Kang
    Yong Kweon Suh
    Journal of Mechanical Science and Technology, 2008, 22 : 1192 - 1201
  • [8] Microfiber Filter Performance Prediction Using a Lattice-Boltzmann Method
    Xavier Augusto, Liliana de Luca
    Ross-Jones, Jesse
    Lopes, Gabriela Cantarelli
    Tronville, Paolo
    Silveira Goncalves, Jose Antonio
    Raedle, Matthias
    Krause, Mathias J.
    COMMUNICATIONS IN COMPUTATIONAL PHYSICS, 2018, 23 (04) : 910 - 931
  • [9] Performance evaluation of parallel SAT solver on Xeon Phi processor
    Nishiwaki S.
    Fujieda N.
    Ichikawa S.
    IEEJ Transactions on Industry Applications, 2019, 139 (02) : 119 - 126
  • [10] Comparison of implementations of the lattice-Boltzmann method
    Mattila, Keijo
    Hyvaeluoma, Jari
    Timonen, Jussi
    Rossi, Tuomo
    COMPUTERS & MATHEMATICS WITH APPLICATIONS, 2008, 55 (07) : 1514 - 1524