Systematic Fusion of CUDA Kernels for Iterative Sparse Linear System Solvers

Cited by: 11
Authors
Aliaga, Jose I. [1 ]
Perez, Joaquin [1 ]
Quintana-Orti, Enrique S. [1 ]
Affiliations
[1] Univ Jaume 1, Dept Ingn & Ciencia Comp, Castellon de La Plana 12071, Spain
Source
EURO-PAR 2015: PARALLEL PROCESSING | 2015 / Vol. 9233
Keywords
Graphics processors; CUDA; Sparse linear systems; Iterative solvers
DOI
10.1007/978-3-662-48096-0_52
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We introduce a systematic analysis in order to fuse CUDA kernels arising in efficient iterative methods for the solution of sparse linear systems. Our procedure characterizes the input and output vectors of these methods, combining this information together with a dependency analysis, in order to decide which kernels to merge. The experiments on a recent NVIDIA "Kepler" GPU report significant gains, especially in energy consumption, for the fused implementations derived from the application of the methodology to three of the most popular Krylov subspace solvers with/without preconditioning.
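The idea of merging kernels that share input and output vectors can be illustrated with a minimal CPU-side sketch. It is not the authors' implementation: each loop stands in for one CUDA kernel launch, and the function names (`axpy_kernel`, `dot_kernel`, `unfused_update`, `fused_update`) are hypothetical. The example assumes a residual update `r := r - alpha * q` followed by the dot product `rho := r . r`, a pair of operations that appears in Krylov solvers such as conjugate gradient. Because the second operation reads exactly the vector the first one writes, element by element, the two passes can be merged into one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for kernel 1 (AXPY-like update): r := r - alpha * q.
void axpy_kernel(double alpha, const std::vector<double>& q,
                 std::vector<double>& r) {
    assert(q.size() == r.size());
    for (std::size_t i = 0; i < r.size(); ++i) {
        r[i] -= alpha * q[i];
    }
}

// Stand-in for kernel 2 (DOT): returns r . r.
double dot_kernel(const std::vector<double>& r) {
    double rho = 0.0;
    for (double v : r) {
        rho += v * v;
    }
    return rho;
}

// Unfused version: two "launches", so r is written once and then read again.
double unfused_update(double alpha, const std::vector<double>& q,
                      std::vector<double>& r) {
    axpy_kernel(alpha, q, r);
    return dot_kernel(r);
}

// Fused version: one "launch"; each updated element is reused for the
// dot product while it is still at hand, avoiding the second pass over r.
double fused_update(double alpha, const std::vector<double>& q,
                    std::vector<double>& r) {
    assert(q.size() == r.size());
    double rho = 0.0;
    for (std::size_t i = 0; i < r.size(); ++i) {
        const double ri = r[i] - alpha * q[i];
        r[i] = ri;
        rho += ri * ri;
    }
    return rho;
}
```

On a GPU the fused version would additionally need a parallel reduction for `rho`, which this sequential sketch leaves out. What it does show is the dependency pattern that makes the merge legal: element `i` of the output of the first operation is the only value the second operation needs at index `i`.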
Pages: 675-686
Number of pages: 12