Towards a Domain-Extensible Compiler: Optimizing an Image Processing Pipeline on Mobile CPUs

Cited by: 3
Authors
Koehler, Thomas [1]
Steuwer, Michel [2]
Affiliations
[1] Univ Glasgow, Glasgow, Lanark, Scotland
[2] Univ Edinburgh, Edinburgh, Midlothian, Scotland
Source
CGO '21: PROCEEDINGS OF THE 2021 IEEE/ACM INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION (CGO) | 2021
Keywords
Code generation; Compilers; Performance; Image Processing; Rise; Elevate;
DOI
10.1109/CGO51591.2021.9370337
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Halide and many similar projects have demonstrated the great potential of domain-specific optimizing compilers. They enable programs to be expressed at a convenient high level while generating high-performance code for parallel architectures. As domains of interest expand towards deep learning, probabilistic programming and beyond, it becomes increasingly clear that redesigning domain-specific compilers for each new domain is unsustainable. In addition, the rapidly growing number of hardware architectures to optimize for poses great challenges for designing these compilers. In this paper, we show how to extend a unifying domain-extensible compiler with domain-specific as well as hardware-specific optimizations. The compiler operates on generic patterns that have proven flexible enough to express a wide range of computations. Optimizations are not hard-coded into the compiler but are expressed as user-defined rewrite rules that are composed into strategies controlling the optimization process. Crucially, both computational patterns and optimization strategies are extensible without modifying the core compiler implementation. We demonstrate that this domain-extensible compiler design is capable of expressing image processing pipelines and well-known image processing optimizations. Our results on four mobile ARM multi-core CPUs, often used for image processing tasks, show that the code generated for the Harris operator outperforms the image processing library OpenCV by up to 16x and achieves performance close to, or even up to 1.4x better than, the state-of-the-art image processing compiler Halide.
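The idea of expressing optimizations as rewrite rules composed into strategies can be illustrated with a minimal sketch. It is not the compiler's actual API: the toy expression type, the two arithmetic rules, and the combinator names (`seq`, `choice`, `repeat`, `bottom_up`) are all hypothetical and chosen only for illustration. A strategy here is assumed to be a function that returns a rewritten expression, or `None` when it does not apply.

```python
from typing import Callable, Optional, Tuple, Union

# Toy expression: an int literal, a variable name, or (op, lhs, rhs).
# This representation is an assumption made for the sketch.
Expr = Union[int, str, Tuple[str, "Expr", "Expr"]]
# A strategy returns the rewritten expression, or None if it does not apply.
Strategy = Callable[[Expr], Optional[Expr]]


def mul_one(e: Expr) -> Optional[Expr]:
    """Rewrite rule: x * 1 -> x."""
    if isinstance(e, tuple) and e[0] == "*" and e[2] == 1:
        return e[1]
    return None


def add_zero(e: Expr) -> Optional[Expr]:
    """Rewrite rule: x + 0 -> x."""
    if isinstance(e, tuple) and e[0] == "+" and e[2] == 0:
        return e[1]
    return None


def seq(first: Strategy, second: Strategy) -> Strategy:
    """Apply `first`, then `second`; fail if either fails."""
    def run(e: Expr) -> Optional[Expr]:
        mid = first(e)
        return None if mid is None else second(mid)
    return run


def choice(first: Strategy, second: Strategy) -> Strategy:
    """Try `first`; if it fails, try `second` on the original input."""
    def run(e: Expr) -> Optional[Expr]:
        out = first(e)
        return out if out is not None else second(e)
    return run


def repeat(s: Strategy) -> Strategy:
    """Apply `s` until it fails; never fails itself."""
    def run(e: Expr) -> Optional[Expr]:
        while True:
            out = s(e)
            if out is None:
                return e
            e = out
    return run


def bottom_up(s: Strategy) -> Strategy:
    """Rewrite the children first, then the node itself.
    Assumes `s` never fails (e.g. it is wrapped in `repeat`)."""
    def run(e: Expr) -> Optional[Expr]:
        if isinstance(e, tuple):
            op, lhs, rhs = e
            e = (op, run(lhs), run(rhs))
        return s(e)
    return run


# A user-defined strategy built purely by composition.
simplify = bottom_up(repeat(choice(mul_one, add_zero)))
```

In this sketch, adding a new optimization means writing another rule function and composing it into a strategy; the combinators themselves stay unchanged, which mirrors the extensibility claim in the abstract at a very small scale.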
Pages: 27-38
Page count: 12