GPU-warp based finite element matrices generation and assembly using coloring method

Cited by: 25

Authors:

Kiran, Utpal [1]

Sharma, Deepak [1]

Gautam, Sachin Singh [1]

Affiliations:

[1] Indian Inst Technol Guwahati, Dept Mech Engn, Gauhati 781039, Assam, India

Source:

JOURNAL OF COMPUTATIONAL DESIGN AND ENGINEERING | 2019 / Vol. 6 / Issue 04

Keywords:

Finite element method; Numerical integration; Assembly; GPU; CUDA; Coloring method; NUMERICAL-INTEGRATION; IMPLEMENTATION; ACCELERATION; SOLVERS; SYSTEM;

DOI:

10.1016/j.jcde.2018.11.001

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification codes:

081203; 0835

Abstract:

The finite element method has been successfully implemented on graphics processing units (GPUs) to achieve a significant reduction in simulation time. In this paper, new strategies for finite element matrix generation, including numerical integration and assembly, are proposed that use one warp per element for a given mesh. These strategies are developed using the well-known coloring method. The proposed strategies use a specialized algorithm to realize fine-grained parallelism and efficient use of on-chip memory resources. The warp shuffle feature of the Compute Unified Device Architecture (CUDA) is used to accelerate numerical integration. The evaluation of the elemental stiffness matrix is further optimized by adopting a partially parallel implementation of numerical integration. The performance of the proposed strategies is evaluated for a three-dimensional elasticity problem using 8-noded hexahedral elements with three degrees of freedom per node. We obtain a speedup of up to 8.2x over the coloring-based assembly-by-element strategy (using a single thread per element) on an NVIDIA Tesla K40 GPU. The proposed strategies also achieve better arithmetic throughput and bandwidth. © 2018 Society for Computational Design and Engineering. Publishing Services by Elsevier.
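The coloring idea mentioned in the abstract can be illustrated with a small CPU-side sketch. It is not the authors' CUDA implementation: it assumes a greedy coloring heuristic, one scalar degree of freedom per node, and a toy 1D mesh of 2-node elements. The names `color_elements`, `assemble_by_color`, and `bar_stiffness` are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict


def color_elements(elements):
    """Greedy coloring: two elements that share a node never get the same color."""
    node_to_elems = defaultdict(list)
    for e, nodes in enumerate(elements):
        for n in nodes:
            node_to_elems[n].append(e)

    colors = [-1] * len(elements)
    for e, nodes in enumerate(elements):
        # Colors already taken by neighbouring elements (those sharing a node).
        used = {colors[nb] for n in nodes for nb in node_to_elems[n]
                if colors[nb] >= 0}
        c = 0
        while c in used:
            c += 1
        colors[e] = c
    return colors


def assemble_by_color(elements, colors, elem_matrix, n_nodes):
    """Assemble a dense global matrix one color at a time.

    Elements of one color touch disjoint nodes, so on a GPU they could be
    processed concurrently without atomic updates. Here they are simply
    visited in a serial loop.
    """
    K = [[0.0] * n_nodes for _ in range(n_nodes)]
    for c in range(max(colors) + 1):
        for e, nodes in enumerate(elements):
            if colors[e] != c:
                continue
            ke = elem_matrix(e)
            for i, gi in enumerate(nodes):
                for j, gj in enumerate(nodes):
                    K[gi][gj] += ke[i][j]
    return K


def bar_stiffness(e):
    # Assumed unit-stiffness 2-node element, for illustration only.
    return [[1.0, -1.0], [-1.0, 1.0]]


# Toy mesh: a chain of four 2-node elements on five nodes.
elements = [(0, 1), (1, 2), (2, 3), (3, 4)]
colors = color_elements(elements)
K = assemble_by_color(elements, colors, bar_stiffness, n_nodes=5)
```

In this toy mesh the elements alternate between two colors, so each color group could be handed to the GPU as one conflict-free batch. The per-element work (numerical integration of the elemental matrix) is what the paper distributes across the threads of a warp; that part is not modeled here.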


Pages: 705-718

Page count: 14

References (30 in total)

[1] [Anonymous], HIGH PERFORMANCE COM

[2] [Anonymous], INT J PARALLEL PROGR

[3] [Anonymous], 2016, NVIDIA CUDA C PROGR

[4] [Anonymous], 1993, An introduction to the finite element method

[5] [Anonymous], CUD TOOLK DOC V8 0

[6] Numerical integration on GPUs for higher order finite elements [J]. Banas, Krzysztof; Plaszewski, Przemyslaw; Maciol, Pawel. COMPUTERS & MATHEMATICS WITH APPLICATIONS, 2014, 67 (06): 1319-1344

[7] Sparse matrix solvers on the GPU: Conjugate gradients and multigrid [J]. Bolz, J; Farmer, I; Grinspun, E; Schröder, P. ACM TRANSACTIONS ON GRAPHICS, 2003, 22 (03): 917-924

[8] The Magma algebra system I: The user language [J]. Bosma, W; Cannon, J; Playoust, C. JOURNAL OF SYMBOLIC COMPUTATION, 1997, 24 (3-4): 235-265

[9] A high performance crashworthiness simulation system based on GPU [J]. Cai, Yong; Wang, Guoping; Li, Guangyao; Wang, Hu. ADVANCES IN ENGINEERING SOFTWARE, 2015, 86: 29-38

[10] Assembly of finite element methods on graphics processors [J]. Cecka, Cris; Lew, Adrian J.; Darve, E. INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERING, 2011, 85 (05): 640-669
