Deciding Non-Compressible Blocks in Sparse Direct Solvers using Incomplete Factorization

被引:0
作者
Korkmaz, Esragul [1 ]
Faverge, Mathieu [1 ]
Ramet, Pierre [1 ]
Pichon, Gregoire [2 ]
机构
[1] Univ Bordeaux, UMR 5800, CNRS, Bordeaux INP,Inria, Talence, France
[2] Univ Lyon, EnsL, UCBL, CNRS,Inria,LIP, Lyon, France
来源
2021 IEEE 28TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING, DATA, AND ANALYTICS (HIPC 2021) | 2021年
关键词
sparse direct solvers; low-rank compression; ILU factorization; PERFORMANCE; MATRICES;
D O I
10.1109/HiPC53243.2021.00024
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
Low-rank compression techniques are very promising for reducing memory footprint and execution time on a large spectrum of linear solvers. Sparse direct supernodal approaches are one of these techniques. However, despite providing a very good scalability and reducing the memory footprint, they suffer from an important flops overhead in their unstructured low-rank updates. As a consequence, the execution time is not improved as expected. In this paper, we study a solution to improve low-rank compression techniques in sparse supernodal solvers. The proposed method tackles the overprice of the low-rank updates by identifying the blocks that have poor compression rates. We show that the fill-in levels of the graph based block incomplete LU factorization can be used in a new context to identify most of these non-compressible blocks at low cost. This identification enables to postpone the low-rank compression step to trade small extra memory consumption for a better time to solution. The solution is validated within the PASTIX library with a large set of application matrices. It demonstrates sequential and multi-threaded speedup up to 8.5x, for small memory overhead of less than 1.49x with respect to the original version.
引用
收藏
页码:101 / 110
页数:10
相关论文
共 28 条
  • [11] Duff Iain S, 2017, Direct methods for sparse matrices
  • [12] AN EFFICIENT MULTICORE IMPLEMENTATION OF A NOVEL HSS-STRUCTURED MULTIFRONTAL SOLVER USING RANDOMIZED SAMPLING
    Ghysels, Pieter
    Li, Xiaoye S.
    Rouet, Francois-Henry
    Williams, Samuel
    Napov, Artem
    [J]. SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2016, 38 (05) : S358 - S384
  • [13] ENHANCING PERFORMANCE AND ROBUSTNESS OF ILU PRECONDITIONERS BY BLOCKING AND SELECTIVE TRANSPOSITION
    Gupta, Anshul
    [J]. SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2017, 39 (01) : A303 - A332
  • [14] Hackbusch W, 2015, SPR SER COMPUT MATH, V49, pCP5, DOI 10.1007/978-3-662-47324-5
  • [15] Hackbusch W, 1999, COMPUTING, V62, P89, DOI 10.1007/s006070050015
  • [16] Data-sparse approximation by adaptive H2-matrices
    Hackbusch, W
    Börm, S
    [J]. COMPUTING, 2002, 69 (01) : 1 - 35
  • [17] On finding approximate supernodes for an efficient block-ILU(k) factorization
    Henon, Pascal
    Ramet, Pierre
    Roman, Jean
    [J]. PARALLEL COMPUTING, 2008, 34 (6-8) : 345 - 362
  • [18] A scalable parallel algorithm for incomplete factor preconditioning
    Hysom, D
    Pothen, A
    [J]. SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2001, 22 (06) : 2194 - 2215
  • [19] Karypis G., 1997, P IEEE ACM SC97 C
  • [20] Lize B., 2014, THESIS PARIS