The Sliced COO format for Sparse Matrix-Vector Multiplication on CUDA-enabled GPUs

Cited by: 15
Authors
Dang, Hoang-Vu [1]
Schmidt, Bertil [1]
Affiliations
[1] Johannes Gutenberg Univ Mainz, Inst Informat, D-55128 Mainz, Germany
Source
PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE, ICCS 2012 | 2012 / Vol. 9
Keywords
SpMV; CUDA; Fermi
DOI
10.1016/j.procs.2012.04.007
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Existing formats for Sparse Matrix-Vector Multiplication (SpMV) on the GPU outperform their corresponding implementations on multi-core CPUs. In this paper, we present a new format called Sliced COO (SCOO) and an efficient CUDA implementation to perform SpMV on the GPU. While previous work reports experiments on small to medium-sized sparse matrices, we perform evaluations on large sparse matrices. We compare SCOO performance to existing formats of the NVIDIA Cusp library. Our results on a Fermi GPU show that SCOO outperforms the COO and CSR formats for all tested matrices and the HYB format for all tested unstructured matrices. Furthermore, a comparison to a Sandy Bridge CPU shows that SCOO on a Fermi GPU outperforms the multi-threaded CSR implementation of the Intel MKL library on an i7-2700K by a factor between 5.5 and 18.
Pages: 57-66
Number of pages: 10