Performance analysis of SSE and AVX instructions in multi-core CPUs and GPU computing on FDTD scheme for solid and fluid vibration problems

Cited by: 5
Authors
Frances, Jorge [1]
Bleda, Sergio [1]
Marquez, Andres [1]
Neipp, Cristian [1]
Gallego, Sergi [1]
Otero, Beatriz [2]
Belendez, Augusto [1, 3]
Affiliations
[1] Univ Alicante, Dept Fis Ingn Sistemas & Teoria Senal, E-03080 Alicante, Spain
[2] Univ Politecn Cataluna, Dept Arquitectura Comp, ES-08034 Barcelona, Spain
[3] Univ Alicante, Inst Univ Fis Aplicada Ciencias & Tecnol, E-03080 Alicante, Spain
Keywords
FDTD; GPU; CPU; OpenMP; AVX; Vibration; SV-WAVE PROPAGATION; MAXWELLS EQUATIONS; MEDIA; SIMULATION
DOI
10.1007/s11227-013-1065-x
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this work, a unified treatment of solid and fluid vibration problems is developed by means of the Finite-Difference Time-Domain (FDTD) method. The proposed scheme takes advantage of a scaling factor in the velocity fields that improves both the performance of the method and the vibration analysis in heterogeneous media. Moreover, the scheme has been extended to simulate propagation in porous media and in lossy solid materials. To reproduce the interaction of fluids and solids accurately in FDTD, both the time and spatial steps must be reduced compared with the setup used in acoustic FDTD problems. This implies larger grids and hence more time and memory resources. To reduce simulation time, the FDTD code has been adapted to exploit the resources available in modern parallel architectures. For CPUs, the implicit usage of the Advanced Vector Extensions (AVX) in multi-core CPUs has been considered. In addition, the computation has been distributed across the available cores by means of OpenMP directives. Graphics Processing Units (GPUs) have also been considered, and the improvement achieved with this parallel architecture has been compared with the highly tuned CPU scheme by means of the relative speed-up. The parallel versions implemented were up to 3 (AVX and OpenMP) and 40 (CUDA) times faster than the best sequential version for CPU, which also uses OpenMP with auto-vectorization techniques but does not implicitly include vector instructions. The results obtained with both parallel approaches demonstrate that massively parallel programming techniques are mandatory in solid-vibration problems with FDTD.
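As a rough illustration of the kind of loop the abstract refers to, the sketch below shows one leapfrog time step of a generic 1D staggered-grid velocity-pressure FDTD update in C, with the work shared among cores and vectorized through OpenMP directives. It is not the scheme of the paper: the function name `fdtd_step`, the coefficients `cv` and `cp`, the rigid-wall boundaries, and the 1D acoustic setting are all assumptions made for illustration, and the velocity scaling factor, porous media, and lossy solids described above are not modeled.

```c
#include <assert.h>

/* Hypothetical sketch: one leapfrog step of a 1D staggered-grid
 * velocity-pressure FDTD scheme.
 *   p  : pressure at n cell centers
 *   v  : velocity at n + 1 cell faces; v[0] and v[n] are never
 *        updated, so they act as rigid walls if initialized to zero
 *   cv : assumed to hold dt / (rho * dx)
 *   cp : assumed to hold rho * c * c * dt / dx
 * Stability of this 1D scheme requires cv * cp <= 1 (CFL condition). */
void fdtd_step(float *restrict p, float *restrict v, int n,
               float cv, float cp)
{
    assert(n >= 2);

    /* Velocity update on interior faces. The directive asks OpenMP to
     * split iterations among threads and vectorize each chunk; it is
     * ignored when the code is compiled without OpenMP support. */
    #pragma omp parallel for simd
    for (int i = 1; i < n; ++i)
        v[i] -= cv * (p[i] - p[i - 1]);

    /* Pressure update at cell centers, using the new velocities. */
    #pragma omp parallel for simd
    for (int i = 0; i < n; ++i)
        p[i] -= cp * (v[i + 1] - v[i]);
}
```

Each loop iteration is independent of the others, which is why the same update lends itself to SIMD instructions, to multi-threading, and to one-thread-per-cell GPU kernels. With GCC or Clang the directives take effect when compiling with `-fopenmp`.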
Pages: 514-526
Number of pages: 13