High-performance direct gravitational N-body simulations on graphics processing units

Cited by: 71
Authors
Portegies Zwart, Simon [1]
Belleman, Robert G.
Geldof, Peter M.
Affiliations
[1] Univ Amsterdam, Sect Computat Sci, Amsterdam, Netherlands
[2] Univ Amsterdam, Astron Inst Anton Pannekoek, Amsterdam, Netherlands
Keywords
gravitation; stellar dynamics; methods: N-body simulation; methods: numerical
DOI
10.1016/j.newast.2007.05.004
Chinese Library Classification number
P1 [Astronomy]
Discipline classification code
0704
Abstract
We present the results of gravitational direct N-body simulations using the commercial graphics processing units (GPUs) NVIDIA Quadro FX1400 and GeForce 8800GTX, and compare the results with those of GRAPE-6Af special-purpose hardware. The force evaluation of the N-body problem was implemented in Cg, using the GPU directly to speed up the calculations. The integration of the equations of motion, running on the host computer, was implemented in C using the fourth-order predictor-corrector Hermite integrator with block time steps. We find that for a large number of particles (N ≳ 10^4) modern graphics processing units offer an attractive low-cost alternative to GRAPE special-purpose hardware. A modern GPU continues to give a relatively flat scaling with the number of particles, comparable to that of the GRAPE. The GRAPE is designed to reach double precision, whereas the GPU is intrinsically single-precision. For relatively large time steps, the total energy of the N-body system was conserved to better than one part in 10^6 on the GPU, which is impressive given the single-precision nature of the GPU. For the same time steps, the GRAPE gave somewhat more accurate results, by about an order of magnitude. However, smaller time steps allowed better energy accuracy on the GRAPE, around 10^-11, whereas on the GPU machine precision saturates around 10^-6. For N ≳ 10^6 the GeForce 8800GTX was about 20 times faster than the host computer. Though still about a factor of a few slower than GRAPE, modern GPUs outperform GRAPE in their low cost, long mean time between failures and much larger onboard memory; the GRAPE-6Af holds at most 256k particles, whereas the GeForce 8800GTX can hold 9 million particles in memory. © 2007 Elsevier B.V. All rights reserved.
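Illustrative note (not part of the original record): the force evaluation named in the abstract is a direct O(N^2) summation over all particle pairs, and a Hermite integrator needs both the acceleration and its time derivative (the jerk). The following is a minimal sketch in plain Python under stated assumptions, not the authors' Cg or C code. The function name `acc_and_jerk`, the softening parameter `eps2`, and the choice of units with G = 1 are all assumptions made for illustration.

```python
import math

def acc_and_jerk(mass, pos, vel, eps2=0.0):
    """Direct-summation acceleration and jerk for every particle.

    Hypothetical helper for illustration only. Units with G = 1 are assumed;
    eps2 is an assumed (optional) softening length squared.
    """
    n = len(mass)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    jerk = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Relative position and velocity of particle j as seen from i.
            dr = [pos[j][k] - pos[i][k] for k in range(3)]
            dv = [vel[j][k] - vel[i][k] for k in range(3)]
            r2 = sum(x * x for x in dr) + eps2
            rv = sum(dr[k] * dv[k] for k in range(3))
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                # a_i += m_j * dr / r^3
                acc[i][k] += mass[j] * dr[k] * inv_r3
                # j_i += m_j * (dv / r^3 - 3 (dr . dv) dr / r^5)
                jerk[i][k] += mass[j] * (dv[k] - 3.0 * rv / r2 * dr[k]) * inv_r3
    return acc, jerk

# Two equal-mass bodies one length unit apart, moving in opposite directions.
mass = [1.0, 1.0]
pos = [[-0.5, 0.0, 0.0], [0.5, 0.0, 0.0]]
vel = [[0.0, -0.5, 0.0], [0.0, 0.5, 0.0]]
acc, jerk = acc_and_jerk(mass, pos, vel)
```

The double loop is the part that maps naturally onto parallel hardware, since each particle's sum is independent of the others. In this sketch everything is computed in Python's double-precision floats, so it does not reproduce the single-precision behaviour discussed in the abstract.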
Pages: 641-650
Number of pages: 10