Improving the arithmetic intensity of multigrid with the help of polynomial smoothers

被引:20
作者
Ghysels, P. [1 ,2 ]
Klosiewicz, P. [1 ,2 ]
Vanroose, W. [1 ]
机构
[1] Univ Antwerp, Dept Math & Comp Sci, B-2020 Antwerp, Belgium
[2] Intel ExaSci Lab, B-3001 Louvain, Belgium
基金
美国国家科学基金会;
关键词
multigrid; Chebyshev iteration; arithmetic intensity; bound by memory bandwidth; STENCIL COMPUTATIONS;
D O I
10.1002/nla.1808
中图分类号
O29 [应用数学];
学科分类号
070104 ;
摘要
The basic building blocks of a classic multigrid algorithm, which are essentially stencil computations, all have a low ratio of executed floating point operations per byte fetched from memory. This important ratio can be identified as the arithmetic intensity. Applications with a low arithmetic intensity are typically bounded by memory traffic and achieve only a small percentage of the theoretical peak performance of the underlying hardware. We propose a polynomial Chebyshev smoother, which we implement using cache-aware tiling, to increase the arithmetic intensity of a multigrid V-cycle. This tiling approach involves a trade-off between redundant computations and cache misses. Unlike common conception, we observe optimal performance for higher degrees of the smoother. The higher-degree polynomial Chebyshev smoother can be used to smooth more than just the upper half of the error frequencies, leading to better V-cycle convergence rates. Smoothing more than the upper half of the error spectrum allows a more aggressive coarsening approach where some levels in the multigrid hierarchy are skipped. Copyright (C) 2012 John Wiley & Sons, Ltd.
引用
收藏
页码:253 / 267
页数:15
相关论文
共 35 条
[1]   Parallel multigrid smoothing: polynomial versus Gauss-Seidel [J].
Adams, M ;
Brezina, M ;
Hu, J ;
Tuminaro, R .
JOURNAL OF COMPUTATIONAL PHYSICS, 2003, 188 (02) :593-610
[2]  
[Anonymous], LECT NOTES COMPUT SC
[3]  
[Anonymous], 3 5 D BLOCKING OPTIM
[4]  
[Anonymous], UTCS97366
[5]  
[Anonymous], 1994, TEMPLATES SOLUTION L, DOI DOI 10.1137/1.9781611971538
[6]  
[Anonymous], 2008, EXASCALE COMPUTING S
[7]  
[Anonymous], INT J PARALLEL EMERG
[8]  
[Anonymous], 2003, ITERATIVE METHODS SP, DOI DOI 10.1137/1.9780898718003
[9]  
Ashby StevenF., 1985, Chebycode, a fortran implementation of manteuffel's adaptive chebyshev algorithm
[10]  
Baskaran M.M., 2008, OPTIMIZING SPARSE MA