The Sliced COO format for Sparse Matrix-Vector Multiplication on CUDA-enabled GPUs

Cited by: 15

Authors:

Dang, Hoang-Vu [1]

Schmidt, Bertil [1]

Affiliations:

[1] Johannes Gutenberg Univ Mainz, Inst Informat, D-55128 Mainz, Germany

Source:

PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE, ICCS 2012 | 2012, Vol. 9

Keywords:

SpMV; CUDA; Fermi;

DOI:

10.1016/j.procs.2012.04.007

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202 ;

Abstract:

Existing formats for Sparse Matrix-Vector Multiplication (SpMV) on the GPU are outperforming their corresponding implementations on multi-core CPUs. In this paper, we present a new format called Sliced COO (SCOO) and an efficient CUDA implementation to perform SpMV on the GPU. While previous work shows experiments on small to medium-sized sparse matrices, we perform evaluations on large sparse matrices. We compared SCOO performance to existing formats of the NVIDIA Cusp library. Our results on a Fermi GPU show that SCOO outperforms the COO and CSR formats for all tested matrices and the HYB format for all tested unstructured matrices. Furthermore, comparison to a Sandy-Bridge CPU shows that SCOO on a Fermi GPU outperforms the multi-threaded CSR implementation of the Intel MKL Library on an i7-2700K by a factor between 5.5 and 18.

Pages: 57-66

Number of pages: 10

