A portable OpenCL implementation of generic particle-mesh and mesh-particle interpolation in 2D and 3D

Cited by: 5
Authors
Bueyuekkececi, Ferit
Awile, Omar
Sbalzarini, Ivo F.
Affiliations
[1] ETH, Inst Theoret Comp Sci, MOSA Grp, CH-8092 Zurich, Switzerland
[2] ETH, Swiss Inst Bioinformat, CH-8092 Zurich, Switzerland
Funding
Swiss National Science Foundation;
Keywords
OpenCL; GPGPU; Particle-mesh method; Interpolation; PIC method; PPM library; GPU; SIMULATIONS; LIBRARY; VORTEX; PPM;
DOI
10.1016/j.parco.2012.12.001
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
Hybrid particle-mesh methods provide a versatile framework for simulating discrete and continuous systems. A key component is the forward and backward interpolation of particle data to mesh nodes. These interpolations typically account for a significant portion of the computational cost of a simulation. Due to its regular compute structure, interpolation admits SIMD parallelism, and several GPU-accelerated implementations have been presented in the literature. We build on these works to develop a streaming-parallel algorithm for interpolation in hybrid particle-mesh methods that works in both 2D and 3D and is free of assumptions about the particle density, the number of particle properties to be interpolated, and the particle indexing scheme. We provide a portable OpenCL implementation of the algorithm and benchmark its accuracy and performance. We show that with such a generic algorithm, speedups of up to 15x over an 8-core multi-threaded CPU implementation are possible if the data are already available on the GPU. The maximum speedup reduces to about 7x if the data first have to be transferred to the GPU. The benchmarks also expose several limitations of GPU acceleration, in particular for low-order and 2D interpolation schemes. The present algorithm is integrated and available in the open-source Parallel Particle Mesh (PPM) library as a hybrid MPI-OpenCL implementation. (C) 2012 Elsevier B.V. All rights reserved.
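To make the two interpolation directions named in the abstract concrete, the following is a minimal serial sketch in Python. It is not the paper's OpenCL algorithm and not the PPM library API. It assumes a 2D periodic mesh with uniform spacing and a simple linear (cloud-in-cell) weighting, chosen only for brevity; the function names `linear_weights`, `particles_to_mesh`, and `mesh_to_particles` are hypothetical.

```python
import math


def linear_weights(x, h):
    """Return (lower node index, lower weight, upper weight) for position x.

    Assumes a uniform mesh with spacing h and linear (cloud-in-cell)
    weights; this choice is an illustration, not the paper's scheme.
    """
    s = x / h
    i = math.floor(s)
    f = s - i
    return i, 1.0 - f, f


def particles_to_mesh(positions, values, nx, ny, h):
    """Spread one scalar particle property onto a periodic nx-by-ny mesh."""
    mesh = [[0.0] * ny for _ in range(nx)]
    for (x, y), v in zip(positions, values):
        i, wx0, wx1 = linear_weights(x, h)
        j, wy0, wy1 = linear_weights(y, h)
        # Each particle contributes to the 4 surrounding mesh nodes.
        for di, wx in ((0, wx0), (1, wx1)):
            for dj, wy in ((0, wy0), (1, wy1)):
                mesh[(i + di) % nx][(j + dj) % ny] += v * wx * wy
    return mesh


def mesh_to_particles(mesh, positions, h):
    """Gather mesh values back to particle positions with the same weights."""
    nx, ny = len(mesh), len(mesh[0])
    gathered = []
    for x, y in positions:
        i, wx0, wx1 = linear_weights(x, h)
        j, wy0, wy1 = linear_weights(y, h)
        acc = 0.0
        for di, wx in ((0, wx0), (1, wx1)):
            for dj, wy in ((0, wy0), (1, wy1)):
                acc += mesh[(i + di) % nx][(j + dj) % ny] * wx * wy
        gathered.append(acc)
    return gathered
```

Because the weights of each particle sum to one, the particle-to-mesh step conserves the total of the interpolated property, and the mesh-to-particle step reproduces a constant field exactly. In the scatter direction several particles may write to the same mesh node, so a parallel version has to guard against conflicting writes to shared nodes, which the serial loop here sidesteps.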
Pages: 94-111
Number of pages: 18
Related papers
28 in total
[1]  
[Anonymous], 2016, Programming massively parallel processors: a hands-on approach
[2]  
[Anonymous], 2009, OPENCL SPECIFICATION
[3]  
Awile O., 2012, COMPUT PHYS COMMUN
[4]   Toward an Object-Oriented Core of the PPM Library [J].
Awile, Omar ;
Demirel, Oemer ;
Sbalzarini, Ivo F. .
NUMERICAL ANALYSIS AND APPLIED MATHEMATICS, VOLS I-III, 2010, 1281 :1313-+
[5]   A HIERARCHICAL O(N-LOG-N) FORCE-CALCULATION ALGORITHM [J].
BARNES, J ;
HUT, P .
NATURE, 1986, 324 (6096) :446-449
[6]   Multilevel adaptive particle methods for convection-diffusion equations [J].
Bergdorf, M ;
Cottet, GH ;
Koumoutsakos, P .
MULTISCALE MODELING & SIMULATION, 2005, 4 (01) :328-357
[7]   GPU and APU computations of Finite Time Lyapunov Exponent fields [J].
Conti, Christian ;
Rossinelli, Diego ;
Koumoutsakos, Petros .
JOURNAL OF COMPUTATIONAL PHYSICS, 2012, 231 (05) :2229-2244
[8]  
Cottet G.-H., 2000, Vortex Methods: Theory and Practice
[9]  
Du P., 2011, PARALLEL COMPUT
[10]   Efficient High-Quality Volume Rendering of SPH Data [J].
Fraedrich, Roland ;
Auer, Stefan ;
Westermann, Ruediger .
IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2010, 16 (06) :1533-1540