Distributed-memory lattice H-matrix factorization

被引:12
|
作者
Yamazaki, Ichitaro [1 ]
Ida, Akihiro [2 ]
Yokota, Rio [3 ]
Dongarra, Jack [4 ]
机构
[1] Sandia Natl Labs, Comp Sci Res Inst, Albuquerque, NM 87123 USA
[2] Univ Tokyo, Informat Technol Ctr, Supercomp Res Div, Tokyo, Japan
[3] Tokyo Inst Technol, Global Sci Informat & Comp Ctr, Tokyo, Japan
[4] Univ Tennessee, Dept Elect Engn & Comp Sci, Knoxville, TN USA
基金
美国国家科学基金会;
关键词
boundary element method; LU factorization; distributed memory; hierarchical matrix; task programming; APPROXIMATION;
D O I
10.1177/1094342019861139
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
We parallelize the LU factorization of a hierarchical low-rank matrix (H-matrix) on a distributed-memory computer. This is much more difficult than the H-matrix-vector multiplication due to the dataflow of the factorization, and it is much harder than the parallelization of a dense matrix factorization due to the irregular hierarchical block structure of the matrix. Block low-rank (BLR) format gets rid of the hierarchy and simplifies the parallelization, often increasing concurrency. However, this comes at a price of losing the near-linear complexity of the H-matrix factorization. In this work, we propose to factorize the matrix using a "lattice H-matrix" format that generalizes the BLR format by storing each of the blocks (both diagonals and off-diagonals) in the H-matrix format. These blocks stored in the H-matrix format are referred to as lattices. Thus, this lattice format aims to combine the parallel scalability of BLR factorization with the near-linear complexity of H-matrix factorization. We first compare factorization performances using the H-matrix, BLR, and lattice H-matrix formats under various conditions on a shared-memory computer. Our performance results show that the lattice format has storage and computational complexities similar to those of the H-matrix format, and hence a much lower cost of factorization than BLR. We then compare the BLR and lattice H-matrix factorization on distributed-memory computers. Our performance results demonstrate that compared with BLR, the lattice format with the lower cost of factorization may lead to faster factorization on the distributed-memory computer.
引用
收藏
页码:1046 / 1063
页数:18
相关论文
共 50 条
  • [1] Parallel H-matrix arithmetic on distributed-memory systems
    Izadi, Mohammad
    COMPUTING AND VISUALIZATION IN SCIENCE, 2012, 15 (02) : 87 - 97
  • [2] Distributed-Memory H-Matrix Algebra I: Data Distribution and Matrix-Vector Multiplication
    Li, Yingzhou
    Poulson, Jack
    Ying, Lexing
    CSIAM TRANSACTIONS ON APPLIED MATHEMATICS, 2021, 2 (03): : 431 - 459
  • [3] Distributed-Memory Parallel Symmetric Nonnegative Matrix Factorization
    Eswar, Srinivas
    Hayashi, Koby
    Ballard, Grey
    Kannan, Ramakrishnan
    Vuduc, Richard
    Park, Haesun
    PROCEEDINGS OF SC20: THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SC20), 2020,
  • [4] ADAPTIVE BLOCK TECHNIQUES FOR MATRIX FACTORIZATION ON DISTRIBUTED-MEMORY COMPUTERS
    STRAUSS, H
    RONSCH, W
    HYPERCUBE AND DISTRIBUTED COMPUTERS, 1989, : 355 - 356
  • [5] Lattice H-Matrices on Distributed-Memory Systems
    Ida, Akihiro
    2018 32ND IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS), 2018, : 389 - 398
  • [6] Distributed-memory hierarchical interpolative factorization
    Li, Yingzhou
    Ying, Lexing
    RESEARCH IN THE MATHEMATICAL SCIENCES, 2017, 4
  • [7] Distributed-memory hierarchical interpolative factorization
    Yingzhou Li
    Lexing Ying
    Research in the Mathematical Sciences, 4
  • [8] Parallel sparse orthogonal factorization on distributed-memory multiprocessors
    Sun, CG
    SIAM JOURNAL ON SCIENTIFIC COMPUTING, 1996, 17 (03): : 666 - 685
  • [9] LU FACTORIZATION ALGORITHMS ON DISTRIBUTED-MEMORY MULTIPROCESSOR ARCHITECTURES
    GEIST, GA
    ROMINE, CH
    SIAM JOURNAL ON SCIENTIFIC AND STATISTICAL COMPUTING, 1988, 9 (04): : 639 - 649
  • [10] A variant of block incomplete factorization preconditioners for a symmetric H-matrix
    Yun, Jae Heon
    Kim, Sang Wook
    Journal of Applied Mathematics and Computing, 2001, 8 (03) : 481 - 496