When Polyhedral Transformations Meet SIMD Code Generation

Cited by: 62
Authors
Kong, Martin [1]
Veras, Richard [2]
Stock, Kevin [1]
Franchetti, Franz [2]
Pouchet, Louis-Noel [3]
Sadayappan, P. [1]
Affiliations
[1] Ohio State Univ, Columbus, OH 43210 USA
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[3] Univ Calif Los Angeles, Los Angeles, CA 90024 USA
Keywords
Algorithms; Performance; Compiler Optimization; Loop Transformations; Affine Scheduling; Program Synthesis; Autotuning
DOI
10.1145/2499370.2462187
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Data locality and parallelism are critical optimization objectives for performance on modern multi-core machines. Both coarse-grain parallelism (e.g., multi-core) and fine-grain parallelism (e.g., vector SIMD) must be effectively exploited. Despite decades of progress at both ends, current compiler optimization schemes that attempt to address data locality and both kinds of parallelism often fail at one of the three objectives. We address this problem by proposing a 3-step framework, which aims for integrated data locality, multi-core parallelism and SIMD execution of programs. We define the concept of vectorizable codelets, with properties tailored to achieve effective SIMD code generation for the codelets. We leverage the power of a modern high-level transformation framework to restructure a program to expose good ISA-independent vectorizable codelets, exploiting multi-dimensional data reuse. Then, we generate ISA-specific customized code for the codelets, using a collection of lower-level SIMD-focused optimizations. We demonstrate our approach on a collection of numerical kernels that we automatically tile, parallelize and vectorize, exhibiting significant performance improvements over existing compilers.
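As a rough illustration of the separation the abstract describes (tiling for locality, an outer loop left available for multi-core execution, and a small inner kernel intended for SIMD code generation), the following C sketch tiles a matrix product by hand. It is not taken from the paper: the names `codelet`, `tiled_matmul`, `fill_inputs` and `max_abs_error`, the sizes `N` and `TILE`, and the choice of matrix multiplication are all assumptions made for this example, and the inner kernel only approximates what a vectorizable codelet might look like (fixed trip counts, unit-stride accesses, no dependence carried by the innermost loop).

```c
#define N 64     /* problem size: assumed, must be a multiple of TILE */
#define TILE 8   /* tile edge: assumed, not a value from the paper */

float A[N][N], B[N][N], C[N][N];

/* Hypothetical "codelet": every loop has a fixed trip count, the innermost
   loop touches B and C with unit stride, and no dependence is carried by
   that loop, so it is a plausible target for SIMD code generation. */
void codelet(int ii, int jj, int kk)
{
    for (int i = ii; i < ii + TILE; i++)
        for (int k = kk; k < kk + TILE; k++) {
            float a = A[i][k];               /* scalar reused across j */
            for (int j = jj; j < jj + TILE; j++)
                C[i][j] += a * B[k][j];      /* stride-1 in j */
        }
}

/* Tiled driver. Different ii tiles write disjoint rows of C, so the ii
   loop could be run in parallel (for example with an OpenMP
   "parallel for" directive); it is left sequential here. */
void tiled_matmul(void)
{
    for (int ii = 0; ii < N; ii += TILE)
        for (int kk = 0; kk < N; kk += TILE)
            for (int jj = 0; jj < N; jj += TILE)
                codelet(ii, jj, kk);
}

/* Fill A and B with small integers so float sums are exact; clear C. */
void fill_inputs(void)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            A[i][j] = (float)((i + j) % 3);
            B[i][j] = (float)((i * j) % 5);
            C[i][j] = 0.0f;
        }
}

/* Largest absolute difference between C and an untiled reference product. */
float max_abs_error(void)
{
    float worst = 0.0f;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            float ref = 0.0f;
            for (int k = 0; k < N; k++)
                ref += A[i][k] * B[k][j];
            float d = C[i][j] - ref;
            if (d < 0.0f) d = -d;
            if (d > worst) worst = d;
        }
    return worst;
}
```

Calling `fill_inputs()`, then `tiled_matmul()`, then `max_abs_error()` checks the tiled result against the untiled loop nest. In the framework the abstract outlines, the restructuring is performed automatically and the codelets receive ISA-specific code generation; here both are written by hand and vectorization is left to the C compiler.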
Pages: 127-138
Page count: 12
Related papers
41 in total
  • [1] [Anonymous], NATO ASI SERIES
  • [2] [Anonymous], P INT C COMP ENG SYS
  • [3] [Anonymous], P SUP
  • [4] [Anonymous], 2000, Generative Programming: Methods, Tools, and Applications
  • [5] Bandishti V., 2012, ACM IEEE C SUP SC 12
  • [6] Baskaran M.M., 2010, CGO
  • [7] Code generation in the polyhedral model is easier than you think
    Bastoul, C
    [J]. 13TH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURE AND COMPILATION TECHNIQUES, PROCEEDINGS, 2004, : 7 - 16
  • [8] Bastoul C, 2004, LECT NOTES COMPUT SC, V3149, P272
  • [9] Achieving extensibility through product-lines and domain-specific languages: A case study
    Batory, D
    Johnson, C
    MacDonald, B
    Von Heeder, D
    [J]. ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 2002, 11 (02) : 191 - 214
  • [10] BATORY D, 2002, P AUT SOFTW ENG C AS