Symbolic and Numeric Kernel Division for Graphics Processing Unit-Based Finite Element Analysis Assembly of Regular Meshes With Modified Sparse Storage Formats

Cited by: 4

Authors:

Sanfui, Subhajit [1]

Sharma, Deepak [1]

Affiliations:

[1] Indian Inst Technol, Dept Mech Engn, Gauhati 781039, Assam, India

Source:

JOURNAL OF COMPUTING AND INFORMATION SCIENCE IN ENGINEERING | 2022 / Vol. 22 / Issue 01

Keywords:

FEA; GPU computing; assembly methods; sparse storage; GPU; GENERATION; MATRICES;

DOI:

10.1115/1.4051123

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Subject classification codes:

081203 ; 0835 ;

Abstract:

This paper presents an efficient strategy for performing the assembly stage of finite element analysis (FEA) on general-purpose graphics processing units (GPUs). The strategy divides the assembly task into symbolic and numeric kernels, thereby reducing the complexity of the standard single-kernel assembly approach. Two sparse storage formats based on the proposed strategy are also developed by modifying existing sparse storage formats to remove degrees-of-freedom-based redundancies in the global matrix. The race condition inherent in parallel assembly is resolved using coloring and atomics. The proposed strategy is compared with state-of-the-art GPU-based and central processing unit (CPU)-based assembly techniques. These comparisons show reduced storage space requirements, shorter execution time, and higher performance (GFLOPS). Moreover, with the proposed strategy, the coloring method is found to be more effective than the atomics-based method for both the existing and the modified storage formats.
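
The split between a symbolic phase and a numeric phase, and the use of coloring to avoid write conflicts, can be illustrated with a small sequential sketch. This is not the authors' implementation and does not reproduce their modified storage formats: it assumes a structured quadrilateral mesh, one scalar degree of freedom per node, plain CSR storage, and a placeholder element matrix `ke`. All function names are hypothetical. The symbolic phase builds only the sparsity pattern; the numeric phase then accumulates element contributions color by color, where elements of one color share no node and could therefore be processed in parallel without atomics.

```python
from bisect import bisect_left

def quad_connectivity(nx, ny):
    """Node ids of each element in an nx-by-ny structured quad mesh."""
    conn = []
    for j in range(ny):
        for i in range(nx):
            n0 = j * (nx + 1) + i
            conn.append((n0, n0 + 1, n0 + nx + 2, n0 + nx + 1))
    return conn

def symbolic_phase(conn, n_nodes):
    """Build the CSR pattern (row_ptr, col_idx); no values are computed."""
    rows = [set() for _ in range(n_nodes)]
    for nodes in conn:
        for a in nodes:
            rows[a].update(nodes)
    row_ptr, col_idx = [0], []
    for cols in rows:
        col_idx.extend(sorted(cols))  # sorted columns allow binary search
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx

def element_colors(nx, ny):
    """Four color groups; elements in the same group share no node."""
    groups = [[] for _ in range(4)]
    for j in range(ny):
        for i in range(nx):
            groups[(i % 2) + 2 * (j % 2)].append(j * nx + i)
    return groups

def numeric_phase(conn, groups, ke, row_ptr, col_idx):
    """Accumulate the 4x4 element matrix ke into the CSR value array."""
    values = [0.0] * len(col_idx)
    for group in groups:      # one conflict-free batch per color
        for e in group:       # on a GPU, these iterations could run in parallel
            nodes = conn[e]
            for a, row in enumerate(nodes):
                lo, hi = row_ptr[row], row_ptr[row + 1]
                for b, col in enumerate(nodes):
                    k = bisect_left(col_idx, col, lo, hi)
                    values[k] += ke[a][b]
    return values

# Usage on a 2x2 mesh (9 nodes) with a placeholder element matrix of ones.
conn = quad_connectivity(2, 2)
row_ptr, col_idx = symbolic_phase(conn, 9)
ke = [[1.0] * 4 for _ in range(4)]
values = numeric_phase(conn, element_colors(2, 2), ke, row_ptr, col_idx)
```

In an atomics-based variant, the color loop would be dropped and the `values[k] += ...` update would have to be an atomic addition, since neighboring elements write to the same entries.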

Pages: 12
