A Basic Linear Algebra Compiler for Structured Matrices

被引:21
作者
Spampinato, Daniele G. [1 ]
Pueschel, Markus [1 ]
机构
[1] Swiss Fed Inst Technol, Dept Comp Sci, Zurich, Switzerland
来源
PROCEEDINGS OF CGO 2016: THE 14TH INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION | 2016年
关键词
Program synthesis; Basic linear algebra; Structured matrices; DSL; Tiling; SIMD vectorization; GENERATION;
D O I
10.1145/2854038.2854060
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
Many problems in science and engineering are in practice modeled and solved through matrix computations. Often, the matrices involved have structure such as symmetric or triangular, which reduces the operations count needed to perform the computation. For example, dense linear systems of equations are solved by first converting to triangular form and optimization problems may yield matrices with any kind of structure. The well-known BLAS (basic linear algebra sub-routine) interface provides a small set of structured matrix computations, chosen to serve a certain set of higher level functions (LAPACK). However, if a user encounters a computation or structure that is not supported, she loses the benefits of the structure and chooses a generic library. In this paper, we address this problem by providing a compiler that translates a given basic linear algebra computation on structured matrices into optimized C code, optionally vectorized with intrinsics. Our work combines prior work on the Spiral-like LGen compiler with techniques from polyhedral compilation to mathematically capture matrix structures. In the paper we consider upper/lower triangular and symmetric matrices but the approach is extensible to a much larger set including blocked structures. We run experiments on a modern Intel platform against the Intel MKL library and a baseline implementation showing competitive performance results for both BLAS and non-BLAS functionalities.
引用
收藏
页码:117 / 127
页数:11
相关论文
共 24 条
[1]  
ANDERSON E., 1999, LAPACK USERSGUIDE, V3rd
[2]  
[Anonymous], 2011, ENCY PARALLEL COMPUT
[3]   Code generation in the polyhedral model is easier than you think [J].
Bastoul, C .
13TH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURE AND COMPILATION TECHNIQUES, PROCEEDINGS, 2004, :7-16
[4]  
Beaugnon U, 2014, ACM SIGPLAN NOTICES, V49, P115, DOI [10.1145/2597809.2597818, 10.1145/2666357.2597818]
[5]   A Practical Automatic Polyhedral Parallelizer and Locality Optimizer [J].
Bondhugula, Uday ;
Hartono, Albert ;
Ramanujam, J. ;
Sadayappan, R. .
PLDI'08: PROCEEDINGS OF THE 2008 SIGPLAN CONFERENCE ON PROGRAMMING LANGUAGE DESIGN & IMPLEMENTATION, 2008, :101-+
[6]  
DONGARRA JJ, 1990, ACM T MATH SOFTWARE, V16, P1, DOI 10.1145/77626.79170
[7]  
Fabregat-Traver D, 2013, LECT NOTES COMPUT SC, V7851, P346, DOI 10.1007/978-3-642-38718-0_33
[8]  
Feautrier Paul, 2011, ENCY PARALLEL COMPUT
[9]  
Franchetti F, 2009, LECT NOTES COMPUT SC, V5658, P385, DOI 10.1007/978-3-642-03034-5_18
[10]   Anatomy of high-performance matrix multiplication [J].
Goto, Kazushige ;
Van De Geijn, Robert A. .
ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 2008, 34 (03)