Recursive blocked algorithms for solving triangular systems -: Part I:: One-sided and coupled Sylvester-type matrix equations

被引:81
作者
Jonsson, I [1 ]
Kågström, B
机构
[1] Umea Univ, Dept Comp Sci, SE-90187 Umea, Sweden
[2] Umea Univ, HPC2N, SE-90187 Umea, Sweden
来源
ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE | 2002年 / 28卷 / 04期
关键词
algorithms; performance; matrix equations; standard Sylvester and Lyapunov; generalized coupled Sylvester; recursion; automatic blocking; superscalar; GEMM-based; level-3; BLAS; SMP parallelization; LAPACK; SLICOT;
D O I
10.1145/592843.592845
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
Triangular matrix equations appear naturally in estimating the condition numbers of matrix equations and different eigenspace computations, including block-diagonalization of matrices and matrix pairs and computation of functions of matrices. To solve a triangular matrix equation is also a major step in the classical Bartels-Stewart method for solving the standard continuous-time Sylvester equation (AX-XB=C). We present novel recursive blocked algorithms for solving one-sided triangular matrix equations, including the continuous-time Sylvester and Lyapunov equations, and a generalized coupled Sylvester equation. The main parts of the computations are performed as level-3 general matrix multiply and add (GEMM) operations. In contrast to explicit standard blocking techniques, our recursive approach leads to an automatic variable blocking that has the potential of matching the memory hierarchies of today's HPC systems. Different implementation issues are discussed, including when to terminate the recursion, the design of new optimized superscalar kernels for solving leaf-node triangular matrix equations efficiently, and how parallelism is utilized in our implementations. Uniprocessor and SMP parallel performance results of our recursive blocked algorithms and corresponding routines in the state-of-the-art libraries LAPACK and SLICOT are presented. The performance improvements of our recursive algorithms are remarkable, including 10-fold speedups compared to standard algorithms.
引用
收藏
页码:392 / 415
页数:24
相关论文
共 49 条
[1]  
Anderson E., 1999, LAPACK USERS GUIDE
[2]  
[Anonymous], 19996 SLICOT
[3]   ON COMPUTING CONDITION NUMBERS FOR THE NONSYMMETRIC EIGENPROBLEM [J].
BAI, Z ;
DEMMEL, J ;
MCKENNEY, A .
ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 1993, 19 (02) :202-223
[4]   ALGORITHM - SOLUTION OF MATRIX EQUATION AX+XB = C [J].
BARTELS, RH ;
STEWART, GW .
COMMUNICATIONS OF THE ACM, 1972, 15 (09) :820-&
[6]   Blocked algorithms and software for reduction of a regular matrix pair to generalized schur form [J].
Dackland, K ;
Kågström, B .
ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 1999, 25 (04) :425-454
[7]   A SET OF LEVEL 3 BASIC LINEAR ALGEBRA SUBPROGRAMS - MODEL IMPLEMENTATION AND TEST PROGRAMS [J].
DONGARRA, JJ ;
DUCROZ, J ;
HAMMARLING, S ;
DUFF, I .
ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 1990, 16 (01) :18-28
[8]  
DONGARRA JJ, 1990, ACM T MATH SOFTWARE, V16, P1, DOI 10.1145/77626.79170
[9]  
Elmroth E, 2001, LECT NOTES COMPUT SC, V1947, P53
[10]   ALGORITHM-705 - A FORTRAN-77 SOFTWARE PACKAGE FOR SOLVING THE SYLVESTER MATRIX EQUATION AXB(T)+CXD(T)=E [J].
GARDINER, JD ;
WETTE, MR ;
LAUB, AJ ;
AMATO, JJ ;
MOLER, CB .
ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 1992, 18 (02) :232-238