Automatic Tuning of Sparse Matrix-Vector Multiplication for CRS format on GPUs

Cited by: 11
Authors
Yoshizawa, Hiroki [1]
Takahashi, Daisuke [2]
Affiliations
[1] Univ Tsukuba, Grad Sch Syst & Informat Engn, 1-1-1 Tennodai, Tsukuba, Ibaraki 3058573, Japan
[2] Univ Tsukuba, Fac Engn Informat & Syst, Tsukuba, Ibaraki 3058573, Japan
Source
15TH IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ENGINEERING (CSE 2012) / 10TH IEEE/IFIP INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING (EUC 2012), 2012
Funding
Japan Science and Technology Agency
Keywords
SpMV; CRS; CG; GPGPU; CUDA
DOI
10.1109/ICCSE.2012.28
Chinese Library Classification number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
The performance of sparse matrix-vector multiplication (SpMV) on GPUs depends heavily on the structure of the sparse matrix, the computing environment, and the choice of certain parameters. In this paper, we show that the performance of the SpMV kernel on GPUs for the compressed row storage (CRS) format depends greatly on the selection of one parameter, and we propose an efficient algorithm that selects the optimal value of this parameter automatically. With automatic parameter selection, the SpMV kernel for the CRS format achieves up to approximately 26% improvement over NVIDIA's CUSPARSE library. The conjugate gradient (CG) method is the most popular iterative method for solving sparse systems of linear equations, and SpMV accounts for the bulk of its computation. With SpMV optimized by our approach, the conjugate gradient method performs up to approximately 10% better than CULA Sparse.
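For readers unfamiliar with the CRS layout, the following is a minimal CPU-side sketch in Python, not the authors' GPU kernel. The abstract does not name the tuned parameter; the sketch assumes, purely for illustration, that it is the number of cooperating threads assigned to each matrix row, a common tunable in CRS kernels. The function names `crs_spmv` and `pick_threads_per_row`, the mean-row-length heuristic, and the sample matrix are all hypothetical and are not taken from the paper.

```python
def crs_spmv(values, col_idx, row_ptr, x, threads_per_row=1):
    """Compute y = A @ x for a matrix A stored in CRS format.

    threads_per_row only emulates how a GPU kernel might split one row
    across several cooperating threads; the result does not depend on it
    (up to floating-point rounding order).
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        start, end = row_ptr[row], row_ptr[row + 1]
        partial = [0.0] * threads_per_row
        for lane in range(threads_per_row):
            # Emulated thread `lane` handles entries start+lane, start+lane+T, ...
            for k in range(start + lane, end, threads_per_row):
                partial[lane] += values[k] * x[col_idx[k]]
        # Stands in for the reduction across the cooperating threads.
        y[row] = sum(partial)
    return y


def pick_threads_per_row(row_ptr, warp_size=32):
    """Hypothetical heuristic (an assumption, not the paper's algorithm):
    smallest power of two >= mean nonzeros per row, capped at warp_size."""
    n_rows = len(row_ptr) - 1
    mean_nnz = row_ptr[-1] / max(n_rows, 1)
    t = 1
    while t < warp_size and t < mean_nnz:
        t *= 2
    return t


# Sample 3x3 matrix [[4, 0, 1], [0, 3, 0], [2, 0, 5]] in CRS form.
values = [4.0, 1.0, 3.0, 2.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]

t = pick_threads_per_row(row_ptr)
y = crs_spmv(values, col_idx, row_ptr, x, threads_per_row=t)
print(t, y)  # prints: 2 [7.0, 6.0, 17.0]
```

On a GPU, a poor choice of this kind of parameter leaves threads idle on short rows or serializes long rows, which is one plausible reason a per-matrix selection could matter as the abstract describes.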
Pages: 130-136
Page count: 7