Two-level parallel load balancing strategy for accelerating DSMC simulations in near-continuum gases

Authors
Xiao, Chenxiang [1]
Zhang, Chenchen [2]
Zhang, Bin [1, 3]
Xu, Hui [1]
Liu, Hong [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Sch Aeronaut & Astronaut, 800 Dong Chuan Rd, Shanghai 200240, Peoples R China
[2] Peking Univ, Sch Math Sci, Beijing 100871, Peoples R China
[3] Shanghai Jiao Tong Univ, Sichuan Res Inst, Chengdu 610213, Peoples R China
Source
INTERNATIONAL JOURNAL OF MODERN PHYSICS C | 2025 / Vol. 36 / No. 03
Keywords
DSMC; MPI/OpenMP; load balance; nonblock communication; MONTE-CARLO METHOD; HYPERSONIC FLOW; NUMERICAL-SIMULATION; CIRCULAR-CYLINDER; IMPLEMENTATION;
DOI
10.1142/S0129183124501985
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
The Direct Simulation Monte Carlo (DSMC) algorithm is widely employed for simulating rarefied gas flows and is increasingly applied in near-continuum regimes for research and engineering purposes. However, its computational demands, notably load imbalance and long simulation times, hinder widespread adoption. To address these challenges, this paper introduces the Two-Level parallel load balancing strategy. This approach combines thread-level and multi-process parallelism to improve load balancing and reduce simulation time. Key features include a thread-level load-decoupling strategy implemented via OpenMP and a multi-process load balancing mechanism on distributed memory via MPI. Building upon our previous PartPlusColl [L. Li, W. Ren and B. Zhang, J. Aeronaut. Astronaut. Aviat. Ser. A 46, 88 (2014)] approach, the load balancing mechanism uses the Stop At Risk (SAR) criteria to trigger repartitioning with METIS. Additionally, a specialized data transmission mechanism based on MPI nonblocking communication minimizes global communication between processes. Validation and evaluation are performed using four hypersonic flow cases around a cylinder and a sphere, demonstrating significant improvements. Notably, the proposed strategy achieves a 30% enhancement over the PartPlusColl strategy under 512 CPU cores compared to 16 CPU cores, and reduces inter-process communication time by 33.57%. These advancements contribute to enhancing the effectiveness of the DSMC algorithm in near-continuum aerodynamic simulations.
Pages: 17