COMMUNICATION-OPTIMAL PARALLEL AND SEQUENTIAL QR AND LU FACTORIZATIONS

被引:209
作者
Demmel, James [1 ]
Grigori, Laura [2 ]
Hoemmen, Mark [3 ]
Langou, Julien [4 ,5 ]
机构
[1] Univ Calif Berkeley, EECS, Berkeley, CA 94720 USA
[2] Univ Paris 11, INRIA Saclay Ile France, Lab Rech Informat, F-91405 Orsay, France
[3] Sandia Natl Labs, Albuquerque, NM 87185 USA
[4] Univ Colorado Denver, Dept Math Sci, Denver, CO 80202 USA
[5] Hlth Sci Ctr, Denver, CO 80202 USA
基金
美国国家科学基金会;
关键词
linear algebra; QR factorization; LU factorization; DECOMPOSITION; COMPLEXITY; ALGORITHM; SERIAL;
D O I
10.1137/080731992
中图分类号
O29 [应用数学];
学科分类号
070104 ;
摘要
We present parallel and sequential dense QR factorization algorithms that are both optimal (up to polylogarithmic factors) in the amount of communication they perform and just as stable as Householder QR. We prove optimality by deriving new lower bounds for the number of multiplications done by "non-Strassen-like" QR, and using these in known communication lower bounds that are proportional to the number of multiplications. We not only show that our QR algorithms attain these lower bounds (up to polylogarithmic factors), but that existing LAPACK and ScaLAPACK algorithms perform asymptotically more communication. We derive analogous communication lower bounds for LU factorization and point out recent LU algorithms in the literature that attain at least some of these lower bounds. The sequential and parallel QR algorithms for tall and skinny matrices lead to significant speedups in practice over some of the existing algorithms, including LAPACK and ScaLAPACK, for example, up to 6.7 times over ScaLAPACK. A performance model for the parallel algorithm for general rectangular matrices predicts significant speedups over ScaLAPACK.
引用
收藏
页码:A206 / A239
页数:34
相关论文
共 49 条
  • [1] ANDERSON E., 1999, LAPACK USERSGUIDE, V3rd
  • [2] [Anonymous], P ACM IEEE SC08 C
  • [3] [Anonymous], 1981, P 13 ANN ACM S THEOR, DOI DOI 10.1145/800076.802486
  • [4] [Anonymous], 2001, PPSC
  • [5] BABOULIN M., 2006, UTCS06582
  • [6] Algorithm 827: irbleigs: a MATLAB program for computing a few eigenpairs of a large sparse Hermitian matrix
    Baglama, J
    Calvetti, D
    Reichel, L
    [J]. ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 2003, 29 (03): : 337 - 348
  • [7] BAI Z., 2000, TEMPLATES SOLUTION A, P196
  • [8] BAKER C.G., ANASAZI WEBPAGE
  • [9] BALLARD G., 2011, UCBEECS201115
  • [10] MINIMIZING COMMUNICATION IN NUMERICAL LINEAR ALGEBRA
    Ballard, Grey
    Demmel, James
    Holtz, Olga
    Schwartz, Oded
    [J]. SIAM JOURNAL ON MATRIX ANALYSIS AND APPLICATIONS, 2011, 32 (03) : 866 - 901