GPU-ACCELERATED DISCONTINUOUS GALERKIN METHODS ON POLYTOPIC MESHES

Cited by: 5

Authors:

Dong, Zhaonan ^{[1,2]}

Georgoulis, Emmanuil H. ^{[3,4,5]}

Kappas, Thomas ^{[3]}

Affiliations:

[1] INRIA, F-75589 Paris, France

[2] Ecole Ponts, CERMICS, F-77455 Marne La Vallee 2, France

[3] Univ Leicester, Sch Math & Actuarial Sci, Leicester LE1 7RH, Leics, England

[4] Natl Tech Univ Athens, Sch Appl Math & Phys Sci, Dept Math, Zografos 15780, Greece

[5] IACM FORTH, Iraklion, Crete, Greece

Source:

SIAM JOURNAL ON SCIENTIFIC COMPUTING | 2021 / Vol. 43 / No. 04

Funding:

Engineering and Physical Sciences Research Council (EPSRC), UK;

Keywords:

discontinuous Galerkin; GPU; polytopic meshes; high order methods; NUMERICAL-INTEGRATION; PARABOLIC PROBLEMS; TIME; EQUATIONS; POLYGONS; CONVEX;

DOI:

10.1137/20M1350984

Chinese Library Classification (CLC) number:

O29 [Applied Mathematics];

Discipline classification code:

070104 ;

Abstract:

Discontinuous Galerkin (dG) methods on meshes consisting of polygonal/polyhedral (henceforth, collectively termed as polytopic) elements have received considerable attention in recent years. Due to the physical frame basis functions used typically and the quadrature challenges involved, the matrix-assembly step for these methods is often computationally cumbersome. To address this important practical issue, this work proposes two parallel assembly implementation algorithms on Compute Unified Device Architecture-enabled graphics cards for the interior penalty dG method on polytopic meshes for various classes of linear PDE problems. We are concerned with both single graphics processing unit (GPU) parallelization, as well as with implementation on distributed GPU nodes. The results included showcase almost linear scalability of the quadrature step with respect to the number of GPU cores used since no communication is needed for the assembly step. In turn, this can justify the claim that polytopic dG methods can be implemented extremely efficiently, as any assembly computing time overhead compared to finite elements on "standard" simplicial or box-type meshes can be effectively circumvented by the proposed algorithms.

Pages: C312-C334

Page count: 23
