Automatic Tuning of Sparse Matrix-Vector Multiplication for CRS format on GPUs

Cited by: 11
Authors
Yoshizawa, Hiroki [1 ]
Takahashi, Daisuke [2 ]
Affiliations
[1] Univ Tsukuba, Grad Sch Syst & Informat Engn, 1-1-1 Tennodai, Tsukuba, Ibaraki 3058573, Japan
[2] Univ Tsukuba, Fac Engn Informat & Syst, Tsukuba, Ibaraki 3058573, Japan
Source
15TH IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ENGINEERING (CSE 2012) / 10TH IEEE/IFIP INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING (EUC 2012) | 2012
Funding
Japan Science and Technology Agency
Keywords
SpMV; CRS; CG; GPGPU; CUDA;
DOI
10.1109/ICCSE.2012.28
CLC number
TP301 [Theory and Methods]
Subject classification code
081202
Abstract
Performance of sparse matrix-vector multiplication (SpMV) on GPUs is highly dependent on the structure of the sparse matrix, the computing environment, and the selection of certain parameters. In this paper, we show that the performance of the SpMV kernel for the compressed row storage (CRS) format on GPUs depends greatly on the optimal selection of a parameter, and we propose an efficient algorithm for automatically selecting its optimal value. With automatic parameter selection, the SpMV kernel for the CRS format achieves up to approximately 26% improvement over NVIDIA's CUSPARSE library. The conjugate gradient method is the most popular iterative method for solving sparse systems of linear equations, and SpMV accounts for the bulk of its computation. By optimizing SpMV with our approach, the conjugate gradient method performs up to approximately 10% better than CULA Sparse.
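The record does not reproduce the authors' kernel or their selection algorithm. The following is a minimal CUDA sketch, assuming the tuned parameter is the number of threads that cooperate on each CRS row, which is a common tuning knob for CRS/CSR SpMV kernels on GPUs; the names spmv_crs and spmv_crs_auto and the dispatch thresholds are hypothetical and for illustration only.

// Minimal CRS (CSR) SpMV sketch, assuming the tuned parameter is the number of
// threads cooperating on each row (THREADS_PER_ROW). Illustrative placeholder,
// not the authors' kernel.
#include <cuda_runtime.h>

template <int THREADS_PER_ROW>              // power of two, at most 32
__global__ void spmv_crs(int n_rows, const int *row_ptr, const int *col_idx,
                         const double *val, const double *x, double *y)
{
    int tid  = blockIdx.x * blockDim.x + threadIdx.x;
    int row  = tid / THREADS_PER_ROW;       // each thread group owns one row
    int lane = tid % THREADS_PER_ROW;

    double sum = 0.0;
    if (row < n_rows)
        for (int j = row_ptr[row] + lane; j < row_ptr[row + 1]; j += THREADS_PER_ROW)
            sum += val[j] * x[col_idx[j]];

    // Reduce the partial sums of the cooperating threads with warp shuffles.
    for (int offset = THREADS_PER_ROW / 2; offset > 0; offset >>= 1)
        sum += __shfl_down_sync(0xffffffffu, sum, offset);

    if (row < n_rows && lane == 0)
        y[row] = sum;
}

// Hypothetical host-side dispatch: pick THREADS_PER_ROW from the average number
// of nonzeros per row. The thresholds are illustrative only; the paper proposes
// its own selection algorithm.
void spmv_crs_auto(int n_rows, int nnz, const int *row_ptr, const int *col_idx,
                   const double *val, const double *x, double *y)
{
    const int block = 128;                  // threads per block (multiple of 32)
    double avg = (double)nnz / n_rows;      // average nonzeros per row
    if (avg <= 4.0)
        spmv_crs<4><<<(n_rows * 4 + block - 1) / block, block>>>(
            n_rows, row_ptr, col_idx, val, x, y);
    else if (avg <= 16.0)
        spmv_crs<16><<<(n_rows * 16 + block - 1) / block, block>>>(
            n_rows, row_ptr, col_idx, val, x, y);
    else
        spmv_crs<32><<<(n_rows * 32 + block - 1) / block, block>>>(
            n_rows, row_ptr, col_idx, val, x, y);
}

A conjugate gradient solver performs one such SpMV per iteration, which is why an SpMV-level gain carries over, attenuated by the remaining vector operations, to the end-to-end improvement cited above.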
Pages: 130-136
Page count: 7