Fast Simulation of Conway's Game of Life Using Bitwise Parallel Bulk Computation on a GPU

被引：2

作者：

Fujita, Toru ^{[1
]}

Nakano, Koji ^{[1
]}

Ito, Yasuaki ^{[1
]}

机构：

[1] Hiroshima Univ, Dept Informat Engn, Kagamiyama 1-4-1, Higashihiroshima 7398527, Japan

来源：

INTERNATIONAL JOURNAL OF FOUNDATIONS OF COMPUTER SCIENCE | 2016年 / 27卷 / 08期

关键词：

Cellular automaton; parallel algorithms; CUDA; GPGPU; MEMORY MACHINE; OFFLINE PERMUTATION; ALGORITHMS;

D O I：

10.1142/S0129054116500404

中图分类号：

TP301 [理论、方法];

学科分类号：

081202 ;

摘要：

Conway's Game of Life is the most well-known cellular automaton. The universe of the Came of Life is a 2-dimensional array of cells, each of which takes two possible states, alive or dead. The state of every cell is repeatedly updated according to those of eight neighbors. A. cell will be alive if exactly three neighbors are alive, or if it is alive and two neighbors are alive. The main contribution of this paper is to develop several acceleration techniques for simulating the Game of Life using a CPU as follows: (1) the states of 32/64 cells in 32/64-bit words (integers) and the next states are computed by the Bitwise Parallel Bulk Computation (BPBC) technique, (2) the states of cells stored in 2 words are updated at the same time by a thread, (3) warp shuffle instruction is used to directly transfer the current states stored in registers, and (4) multi-step simulation is performed to reduce the overhead of data transfer and invoking CUDA kernel. The experimental results show that, the performance of our GPU implementation using GeForce GTX TITAN X is 1350 x 10(9) updates per second for 16K-step simulation of 512K x 512K cells stored in the SSD. Since Intel Core i7 CPU using the same technique performs 13.4 x 10(9) updates per second, our CPU implementation for the Game of Life achieves a speedup factor of 100. Thus, these techniques work very efficiently on a CPU.

引用

页码：981 / 1003

页数：23

共 30 条

[21] Parallel Direct Simulation Monte Carlo Computation Using CUDA on GPUs
Su, C. -C.
Hsieh, C. -W.
Smith, M. R.
Jermy, M. C.
Wu, J. -S.
27TH INTERNATIONAL SYMPOSIUM ON RAREFIED GAS DYNAMICS, 2010, PTS ONE AND TWO, 2011, 1333 : 343 - +
[22] Simulation of disc-bulge-halo galaxies using parallel GPU based codes
Veles, O.
Berczik, P.
Just, A.
STAR CLUSTERS AND BLACK HOLES IN GALAXIES ACROSS COSMIC TIME, 2016, 10 (312): : 79 - 81
[23] The future of parallel computing: GPU vs CELL - General purpose planning against fast graphical computation architectures, which is the best solution for general purposes computation?
Bianchi, Luca
Gatti, Riccardo
Lombardi, Luca
GRAPP 2008: PROCEEDINGS OF THE THIRD INTERNATIONAL CONFERENCE ON COMPUTER GRAPHICS THEORY AND APPLICATIONS, 2008, : 419 - 425
[24] Parallel SAX/GA for financial pattern matching using NVIDIA's GPU
Bauto, Joao
Canelas, Antonio
Neves, Rui
Horta, Nuno
EXPERT SYSTEMS WITH APPLICATIONS, 2018, 105 : 77 - 88
[25] Fast Parallel Markov Clustering in Bioinformatics Using Massively Parallel Computing on GPU with CUDA and ELLPACK-R Sparse Format
Bustamam, Alhadi
Burrage, Kevin
Hamilton, Nicholas A.
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2012, 9 (03) : 679 - 692
[26] Parallelized computation for computer simulation of electrocardiograms using personal computers with multi-core CPU and general-purpose GPU
Shen, Wenfeng
Wei, Daming
Xu, Weimin
Zhu, Xin
Yuan, Shizhong
COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2010, 100 (01) : 87 - 96
[27] Statically Optimal Binary Search Tree Computation Using Non-Serial Polyadic Dynamic Programming on GPU's
Wani, Mohsin Altaf
Ahmad, Manzoor
INTERNATIONAL JOURNAL OF GRID AND HIGH PERFORMANCE COMPUTING, 2019, 11 (01) : 49 - 70
[28] Using GPU's to Accelerate Stencil-based Computation Kernels for the Development of Large Scale Scientific Applications on Heterogeneous Systems
Tao, Jian
Blazewicz, Marek
Brandt, Steven R.
ACM SIGPLAN NOTICES, 2012, 47 (08) : 287 - 288
[29] Pattern classification using updated fuzzy hyper-line segment neural network and it's GPU parallel implementation for large datasets using CUDA
Dhabe, Priyadarshan
Vyas, Prashant
Ganeriwal, Devrat
Pathak, Aditya
2016 INTERNATIONAL CONFERENCE ON COMPUTING, ANALYTICS AND SECURITY TRENDS (CAST), 2016, : 24 - 29
[30] Simulation and experimental control of a 3-RPR parallel robot using optimal fuzzy controller and fast on/off solenoid valves based on the PWM wave
Moezi, Seyed Alireza
Rafeeyan, Mansour
Zakeri, Ehsan
Zare, Amin
ISA TRANSACTIONS, 2016, 61 : 265 - 286

← 1 2 3 →