ParILUT - A Parallel Threshold ILU for GPUs

Cited by: 7
Authors
Anzt, Hartwig [1, 2]
Ribizel, Tobias [1]
Flegar, Goran [3]
Chow, Edmond [4]
Dongarra, Jack [2, 5, 6]
Affiliations
[1] Karlsruhe Inst Technol, Steinbuch Ctr Comp, Karlsruhe, Germany
[2] Univ Tennessee, Innovat Comp Lab ICL, Knoxville, TN 37996 USA
[3] Univ Jaume I Castellon, Dept Ingn & Ciencia Comp, Castellon De La Plana, Spain
[4] Georgia Inst Technol, Sch Computat Sci & Engn, Atlanta, GA 30332 USA
[5] Univ Manchester, Manchester, Lancs, England
[6] Oak Ridge Natl Lab ORNL, Oak Ridge, TN USA
Source
2019 IEEE 33RD INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS 2019) | 2019
Keywords
ParILUT; parallel threshold ILU; incomplete factorization preconditioners; parallel selection; GPU
DOI
10.1109/IPDPS.2019.00033
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we present the first algorithm for computing threshold ILU factorizations on GPU architectures. The proposed ParILUT-GPU algorithm is based on interleaving parallel fixed-point iterations that approximate the incomplete factors for an existing nonzero pattern with a strategy that dynamically adapts the nonzero pattern to the problem characteristics. This requires the efficient selection of thresholds that separate the values to be dropped from the incomplete factors, and we design a novel selection algorithm tailored towards GPUs. All components of the ParILUT-GPU algorithm make heavy use of the features available in the latest NVIDIA GPU generations, and outperform existing multithreaded CPU implementations.
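The threshold-selection step mentioned in the abstract can be illustrated with a minimal sequential sketch. It is not the GPU selection algorithm of the paper: it simply sorts the magnitudes, and the names `select_threshold`, `drop_small`, and `num_keep` are hypothetical, introduced only for this illustration.

```python
def select_threshold(values, num_keep):
    """Return the magnitude of the num_keep-th largest entry by absolute value.

    Sort-based stand-in for a selection algorithm; assumed for illustration only.
    """
    if not 0 < num_keep <= len(values):
        raise ValueError("num_keep must be between 1 and len(values)")
    magnitudes = sorted((abs(v) for v in values), reverse=True)
    return magnitudes[num_keep - 1]


def drop_small(entries, num_keep):
    """Keep the entries whose magnitude reaches the selected threshold.

    entries maps (row, col) positions to values of a sparse factor.
    Ties at the threshold are all kept, so the result may hold more
    than num_keep entries.
    """
    threshold = select_threshold(list(entries.values()), num_keep)
    return {pos: v for pos, v in entries.items() if abs(v) >= threshold}


# Example: keep the three largest-magnitude entries of a small factor.
entries = {(0, 0): 4.0, (1, 0): -0.01, (1, 1): 3.0, (2, 1): 0.5, (2, 2): -2.0}
kept = drop_small(entries, 3)
```

In this example the selected threshold is 2.0, so the entries at (1, 0) and (2, 1) are dropped from the pattern.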
Pages: 231-241
Page count: 11
Related papers
21 in total
  • [1] Adinets A., OPTIMIZED FILTERING
  • [2] Anzt H., 2017, ACCELERATING CONJUGA, P35
  • [3] PARILUT-A NEW PARALLEL THRESHOLD ILU FACTORIZATION
    Anzt, Hartwig
    Chow, Edmond
    Dongarra, Jack
    [J]. SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2018, 40 (04) : C503 - C519
  • [4] Preconditioned Krylov solvers on GPUs
    Anzt, Hartwig
    Gates, Mark
    Dongarra, Jack
    Kreutzer, Moritz
    Wellein, Gerhard
    Koehler, Martin
    [J]. PARALLEL COMPUTING, 2017, 68 : 32 - 44
  • [5] Updating incomplete factorization preconditioners for model order reduction
    Anzt, Hartwig
    Chow, Edmond
    Saak, Jens
    Dongarra, Jack
    [J]. NUMERICAL ALGORITHMS, 2016, 73 (03) : 611 - 630
  • [6] Basermann A, 2000, NUMER LINEAR ALGEBR, V7, P635, DOI 10.1002/1099-1506(200010/12)7:7/8<635::AID-NLA216>3.0.CO;2-B
  • [7] Benzi M., 1999, ELECTRON T NUMER ANA, V8, P88
  • [8] Asynchronous Iterative Algorithm for Computing Incomplete Factorizations on GPUs
    Chow, Edmond
    Anzt, Hartwig
    Dongarra, Jack
    [J]. HIGH PERFORMANCE COMPUTING, ISC HIGH PERFORMANCE 2015, 2015, 9137 : 1 - 16
  • [9] FINE-GRAINED PARALLEL INCOMPLETE LU FACTORIZATION
    Chow, Edmond
    Patel, Aftab
    [J]. SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2015, 37 (02) : C169 - C193