High Performance Stencil Code Generation with LIFT

Cited by: 75
Authors
Hagedorn, Bastian [1]
Stoltzfus, Larisa [2]
Steuwer, Michel [3]
Gorlatch, Sergei [1]
Dubach, Christophe [2]
Affiliations
[1] Univ Munster, Munster, Germany
[2] Univ Edinburgh, Edinburgh, Midlothian, Scotland
[3] Univ Glasgow, Glasgow, Lanark, Scotland
Source
PROCEEDINGS OF THE 2018 INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION (CGO'18) | 2018
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK;
Keywords
Code Generation; Stencil; GPU Computing; Performance Portability; Lift;
DOI
10.1145/3168824
Chinese Library Classification number
TP31 [Computer Software];
Subject classification codes
081202; 0835;
Abstract
Stencil computations are widely used, from physical simulations to machine learning. They are embarrassingly parallel and perfectly fit modern hardware such as Graphics Processing Units. Although stencil computations have been extensively studied, optimizing them for increasingly diverse hardware remains challenging. Domain Specific Languages (DSLs) have raised the programming abstraction and offer good performance. However, this places the burden on DSL implementers, who have to write almost full-fledged parallelizing compilers and optimizers. LIFT has recently emerged as a promising approach to achieve performance portability and is based on a small set of reusable parallel primitives that DSL or library writers can build upon. LIFT's key novelty is in its encoding of optimizations as a system of extensible rewrite rules which are used to explore the optimization space. However, LIFT has mostly focused on linear algebra operations, and it remains to be seen whether this approach is applicable to other domains. This paper demonstrates how complex multidimensional stencil code and optimizations such as tiling are expressible using compositions of simple 1D LIFT primitives. By leveraging existing LIFT primitives and optimizations, we only require the addition of two primitives and one rewrite rule to do so. Our results show that this approach outperforms existing compiler approaches and hand-tuned codes.
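To illustrate the idea of building a stencil from small 1D primitives, here is a minimal Python sketch. It is not LIFT code and not the authors' implementation: the helper names `pad`, `slide`, and `stencil_1d`, the constant-value boundary handling, and the 3-point window are all assumptions chosen for illustration, since the abstract does not name the two added primitives.

```python
from typing import Callable, List, Sequence


def pad(left: int, right: int, value: float, xs: Sequence[float]) -> List[float]:
    """Hypothetical boundary primitive: extend xs with a constant value on both sides."""
    return [value] * left + list(xs) + [value] * right


def slide(size: int, step: int, xs: Sequence[float]) -> List[List[float]]:
    """Hypothetical neighbourhood primitive: overlapping windows of `size`, advancing by `step`."""
    return [list(xs[i:i + size]) for i in range(0, len(xs) - size + 1, step)]


def stencil_1d(f: Callable[[Sequence[float]], float], xs: Sequence[float]) -> List[float]:
    """3-point stencil expressed as a composition: map(f) over slide(3, 1) over pad(1, 1)."""
    return [f(window) for window in slide(3, 1, pad(1, 1, 0.0, xs))]


# Sum each element with its left and right neighbours (zero outside the array).
result = stencil_1d(sum, [1.0, 2.0, 3.0, 4.0])
print(result)
```

Because the stencil is only a composition of generic functions, an optimization such as tiling could in principle be expressed by rewriting the composition (for example, sliding over larger tiles first and then over each tile) without changing the function `f` applied to each neighbourhood.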
Pages: 100-112
Page count: 13