Efficient parallelization for AMR MHD multiphysics calculations; implementation in AstroBEAR

Cited by: 42

Authors:

Carroll-Nellenback, Jonathan J. ^{[1]}

Shroyer, Brandon ^{[1]}

Frank, Adam ^{[1]}

Ding, Chen ^{[2]}

Affiliations:

[1] Univ Rochester, Dept Phys & Astron, Rochester, NY 14627 USA

[2] Univ Rochester, Dept Comp Sci, Rochester, NY 14627 USA

Source:

JOURNAL OF COMPUTATIONAL PHYSICS | 2013 / Vol. 236

Funding:

U.S. National Science Foundation;

Keywords:

Adaptive mesh refinement; Parallel; High performance computing; Distributed tree; Threading; ADAPTIVE MESH REFINEMENT; ALGORITHMS;

DOI:

10.1016/j.jcp.2012.10.004

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification codes:

081203 ; 0835 ;

Abstract:

Current adaptive mesh refinement (AMR) simulations require algorithms that are highly parallelized and manage memory efficiently. As compute engines grow larger, AMR simulations will require algorithms that achieve new levels of efficient parallelization and memory management. We have attempted to employ new techniques to achieve both of these goals. Patch or grid based AMR often employs ghost cells to decouple the hyperbolic advances of each grid on a given refinement level. This decoupling allows each grid to be advanced independently. In AstroBEAR we utilize this independence by threading the grid advances on each level with preference going to the finer level grids. This allows for global load balancing instead of level by level load balancing and allows for greater parallelization across both physical space and AMR level. Threading of level advances can also improve performance by interleaving communication with computation, especially in deep simulations with many levels of refinement. While we see improvements of up to 30% on deep simulations run on a few cores, the speedup is typically more modest (5-20%) for larger scale simulations. To improve memory management we have employed a distributed tree algorithm that requires processors to only store and communicate local sections of the AMR tree structure with neighboring processors. Using this distributed approach we are able to get reasonable scaling efficiency (>80%) out to 12288 cores and up to 8 levels of AMR - independent of the use of threading. (c) 2012 Elsevier Inc. All rights reserved.

Citation

Pages: 461-476

Page count: 16

13 references in total

[1] Divergence-free adaptive mesh refinement for magnetohydrodynamics [J]. Balsara, DS. JOURNAL OF COMPUTATIONAL PHYSICS, 2001, 174 (02): 614-648

[2] LOCAL ADAPTIVE MESH REFINEMENT FOR SHOCK HYDRODYNAMICS [J]. BERGER, MJ; COLELLA, P. JOURNAL OF COMPUTATIONAL PHYSICS, 1989, 82 (01): 64-84

[3] ADAPTIVE MESH REFINEMENT FOR HYPERBOLIC PARTIAL-DIFFERENTIAL EQUATIONS [J]. BERGER, MJ; OLIGER, J. JOURNAL OF COMPUTATIONAL PHYSICS, 1984, 53 (03): 484-512

[4] Colella Phil, PETASCALE BLOCK STRU

[5] SIMULATING MAGNETOHYDRODYNAMICAL FLOW WITH CONSTRAINED TRANSPORT AND ADAPTIVE MESH REFINEMENT: ALGORITHMS AND TESTS OF THE AstroBEAR CODE [J]. Cunningham, Andrew J.; Frank, Adam; Varniere, Peggy; Mitran, Sorin; Jones, Thomas W. ASTROPHYSICAL JOURNAL SUPPLEMENT SERIES, 2009, 182 (02): 519-542

[6] Efficient parallel algorithms and software for compressed octrees with applications to hierarchical methods [J]. Hariharan, B; Aluru, S. PARALLEL COMPUTING, 2005, 31 (3-4): 311-331

[7] Fully threaded tree algorithms for adaptive refinement fluid dynamics simulations [J]. Khokhlov, AM. JOURNAL OF COMPUTATIONAL PHYSICS, 1998, 143 (02): 519-543

[8] PARAMESH: A parallel adaptive mesh refinement community toolkit [J]. MacNeice, P; Olson, KM; Mobarry, C; de Fainchtein, R; Packer, C. COMPUTER PHYSICS COMMUNICATIONS, 2000, 126 (03): 330-354

[9] Improving memory hierarchy performance for irregular applications using data and computation reorderings [J]. Mellor-Crummey, J; Whalley, D; Kennedy, K. INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2001, 29 (03): 217-247

[10] O'Shea B.W., 2004, Introducing Enzo, an AMR Cosmology Application
