Optimal Temporal Blocking for Stencil Computation

被引:17
作者
Muranushi, Takayuki [1 ]
Makino, Junichiro [1 ]
机构
[1] RIKEN AICS, Chuo Ku, 7-1-26 Minatojima Minami Machi, Kobe, Hyogo 6500047, Japan
来源
INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE, ICCS 2015 COMPUTATIONAL SCIENCE AT THE GATES OF NATURE | 2015年 / 51卷
关键词
Parallel computation; Stencil computation; Optimization; LOCALITY;
D O I
10.1016/j.procs.2015.05.315
中图分类号
TP39 [计算机的应用];
学科分类号
081203 ; 0835 ;
摘要
Temporal blocking is a class of algorithms which reduces the required memory bandwidth (B/F ratio) of a given stencil computation, by "blocking" multiple time steps. In this paper, we prove that a lower limit exists for the reduction of the B/F attainable by temporal blocking, under certain conditions. We introduce the PiTCH tiling, an example of temporal blocking method that achieves the optimal B/F ratio. We estimate the performance of PiTCH tiling for various stencil applications on several modern CPUs. We show that PiTCH tiling achieves 1.5 similar to 2 times better B/F reduction in three-dimensional applications, compared to other temporal blocking schemes. We also show that PiTCH tiling can remove the bandwidth bottleneck from most of the stencil applications considered.
引用
收藏
页码:1303 / 1312
页数:10
相关论文
共 12 条
[1]  
[Anonymous], 2013, ICS 13
[2]  
Bandishti V., 2012, INT CONF HIGH PERFOR, P1, DOI DOI 10.1109/SC.2012.107
[3]   The memory behavior of cache oblivious stencil computations [J].
Frigo, Matteo ;
Strumpen, Volker .
JOURNAL OF SUPERCOMPUTING, 2007, 39 (02) :93-112
[4]  
Frigo Matteo, 2005, P 19 ANN INT C SUPER, P361, DOI DOI 10.1145/1088149.1088197
[5]  
LAM MS, 1991, SIGPLAN NOTICES, V26, P63, DOI 10.1145/106973.106981
[6]  
Malas Tareq, 2014, ARXIV14103060
[7]   Halide: A Language and Compiler for Optimizing Parallelism, Locality, and Recomputation in Image Processing Pipelines [J].
Ragan-Kelley, Jonathan ;
Barnes, Connelly ;
Adams, Andrew ;
Paris, Sylvain ;
Durand, Fredo ;
Amarasinghe, Saman .
ACM SIGPLAN NOTICES, 2013, 48 (06) :519-530
[8]   TILING MULTIDIMENSIONAL ITERATION SPACES FOR MULTICOMPUTERS [J].
RAMANUJAM, J ;
SADAYAPPAN, P .
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 1992, 16 (02) :108-120
[9]  
Strzodka Robert, 2010, 24th ACM International Conference on Supercomputing 2010, P49
[10]  
WOLF ME, 1991, SIGPLAN NOTICES, V26, P30