Optimizing Multi-grid Computation and Parallelization on Multi-cores

Cited by: 4
Authors
Yang, Xiaojian [1]
Li, Shengguo [1]
Yuan, Fan [2]
Dong, Dezun [1]
Huang, Chun [1]
Wang, Zheng [3]
Affiliations
[1] Natl Univ Def Technol, Changsha, Peoples R China
[2] Xiangtan Univ, Xiangtan, Peoples R China
[3] Univ Leeds, Leeds, W Yorkshire, England
Source
PROCEEDINGS OF THE 37TH INTERNATIONAL CONFERENCE ON SUPERCOMPUTING, ACM ICS 2023 | 2023
Funding
National Science Foundation (USA); National Key Research and Development Program of China;
Keywords
Multigrid; symmetric Gauss-Seidel; Asynchronous parallelization; PERFORMANCE; EFFICIENT;
DOI
10.1145/3577193.3593726
Chinese Library Classification
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Multigrid (MG) algorithms are widely used to solve large-scale sparse linear systems, a task essential to many high-performance workloads. The symmetric Gauss-Seidel (SYMGS) method is often the performance bottleneck of MG. This paper presents new methods to improve the computation and parallelization efficiency of the SYMGS and MG algorithms on multi-core CPUs. Our solution employs a matrix splitting strategy and a revised computation formula to reduce the computation operations and memory accesses in SYMGS. With this new SYMGS strategy, we can then merge the two most time-consuming components of MG. On top of these, we propose a new asynchronous parallelization scheme to reduce the synchronization overhead when parallelizing SYMGS. We demonstrate the benefit of our techniques by integrating them with the HPCG benchmark and two real-life applications. Evaluation conducted on four architectures, including three ARMv8 and one x86, shows that our techniques greatly surpass the performance of engineer- and vendor-tuned implementations across various workloads and platforms.
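For readers unfamiliar with SYMGS, the following is a minimal sketch of a textbook symmetric Gauss-Seidel sweep (one forward pass followed by one backward pass). It is the standard baseline only, not the matrix-splitting or asynchronous scheme the paper proposes; the function names `symgs_sweep` and `residual_norm`, the dense list-of-lists storage, and the small test system are all assumptions made for illustration.

```python
import math


def symgs_sweep(A, b, x):
    """Apply one symmetric Gauss-Seidel sweep to x in place and return it.

    A is a dense square matrix (list of rows) with nonzero diagonal;
    real solvers would use a sparse format instead (assumption for brevity).
    """
    n = len(b)
    # Forward pass (rows 0..n-1), then backward pass (rows n-1..0).
    for order in (range(n), range(n - 1, -1, -1)):
        for i in order:
            # Off-diagonal contributions, using the most recently updated x.
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
    return x


def residual_norm(A, b, x):
    """Euclidean norm of the residual b - A x."""
    n = len(b)
    r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    return math.sqrt(sum(v * v for v in r))


if __name__ == "__main__":
    # Small diagonally dominant system whose exact solution is [1, 1, 1].
    A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
    b = [3.0, 2.0, 3.0]
    x = [0.0, 0.0, 0.0]
    for _ in range(10):
        symgs_sweep(A, b, x)
    print(x, residual_norm(A, b, x))
```

Each row update depends on values written earlier in the same pass, which is the data dependency that makes SYMGS hard to parallelize and motivates the synchronization-reducing scheme described in the abstract.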
Pages: 227-239
Number of pages: 13