Distributed-memory lattice H-matrix factorization

被引:12
|
作者
Yamazaki, Ichitaro [1 ]
Ida, Akihiro [2 ]
Yokota, Rio [3 ]
Dongarra, Jack [4 ]
机构
[1] Sandia Natl Labs, Comp Sci Res Inst, Albuquerque, NM 87123 USA
[2] Univ Tokyo, Informat Technol Ctr, Supercomp Res Div, Tokyo, Japan
[3] Tokyo Inst Technol, Global Sci Informat & Comp Ctr, Tokyo, Japan
[4] Univ Tennessee, Dept Elect Engn & Comp Sci, Knoxville, TN USA
基金
美国国家科学基金会;
关键词
boundary element method; LU factorization; distributed memory; hierarchical matrix; task programming; APPROXIMATION;
D O I
10.1177/1094342019861139
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
We parallelize the LU factorization of a hierarchical low-rank matrix (H-matrix) on a distributed-memory computer. This is much more difficult than the H-matrix-vector multiplication due to the dataflow of the factorization, and it is much harder than the parallelization of a dense matrix factorization due to the irregular hierarchical block structure of the matrix. Block low-rank (BLR) format gets rid of the hierarchy and simplifies the parallelization, often increasing concurrency. However, this comes at a price of losing the near-linear complexity of the H-matrix factorization. In this work, we propose to factorize the matrix using a "lattice H-matrix" format that generalizes the BLR format by storing each of the blocks (both diagonals and off-diagonals) in the H-matrix format. These blocks stored in the H-matrix format are referred to as lattices. Thus, this lattice format aims to combine the parallel scalability of BLR factorization with the near-linear complexity of H-matrix factorization. We first compare factorization performances using the H-matrix, BLR, and lattice H-matrix formats under various conditions on a shared-memory computer. Our performance results show that the lattice format has storage and computational complexities similar to those of the H-matrix format, and hence a much lower cost of factorization than BLR. We then compare the BLR and lattice H-matrix factorization on distributed-memory computers. Our performance results demonstrate that compared with BLR, the lattice format with the lower cost of factorization may lead to faster factorization on the distributed-memory computer.
引用
收藏
页码:1046 / 1063
页数:18
相关论文
共 50 条
  • [1] Distributed-Memory H-Matrix Algebra I: Data Distribution and Matrix-Vector Multiplication
    Li, Yingzhou
    Poulson, Jack
    Ying, Lexing
    CSIAM TRANSACTIONS ON APPLIED MATHEMATICS, 2021, 2 (03): : 431 - 459
  • [2] Distributed-memory hierarchical interpolative factorization
    Li, Yingzhou
    Ying, Lexing
    RESEARCH IN THE MATHEMATICAL SCIENCES, 2017, 4
  • [3] Communication lower bounds for distributed-memory matrix multiplication
    Irony, D
    Toledo, S
    Tiskin, A
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2004, 64 (09) : 1017 - 1026
  • [4] PARALLEL ANNEALING ON DISTRIBUTED-MEMORY SYSTEMS
    LEE, FH
    STILES, GS
    SWAMINATHAN, V
    PROGRAMMING AND COMPUTER SOFTWARE, 1995, 21 (01) : 1 - 8
  • [5] Memory Reduced Half Hierarchal Matrix (H-Matrix) for Electrodynamic Electric Field Integral Equation
    Negi, Yoginder K.
    PROGRESS IN ELECTROMAGNETICS RESEARCH LETTERS, 2021, 96 : 91 - 96
  • [6] Linear Systems Solvers for Distributed-Memory Machines with GPU Accelerators
    Kurzak, Jakub
    Gates, Mark
    Charara, Ali
    YarKhan, Asim
    Yamazaki, Ichitaro
    Dongarra, Jack
    EURO-PAR 2019: PARALLEL PROCESSING, 2019, 11725 : 495 - 506
  • [7] DISTRIBUTED-MEMORY COMPILER DESIGN FOR SPARSE PROBLEMS
    WU, J
    DAS, R
    SALTZ, J
    BERRYMAN, H
    HIRANANDANI, S
    IEEE TRANSACTIONS ON COMPUTERS, 1995, 44 (06) : 737 - 753
  • [8] Communication Lower Bounds for Distributed-Memory Computations
    Scquizzato, Michele
    Silvestri, Francesco
    31ST INTERNATIONAL SYMPOSIUM ON THEORETICAL ASPECTS OF COMPUTER SCIENCE (STACS 2014), 2014, 25 : 627 - 638
  • [9] H-matrix approximability of the inverses of FEM matrices
    Faustmann, Markus
    Melenk, Jens Markus
    Praetorius, Dirk
    NUMERISCHE MATHEMATIK, 2015, 131 (04) : 615 - 642
  • [10] Adaptive H-matrix computations in linear elasticity
    Bauer, Maximilian
    Bebendorf, Mario
    APPLIED NUMERICAL MATHEMATICS, 2024, 201 : 1 - 19