Merge-based Parallel Sparse Matrix-Vector Multiplication

被引：0

作者：

Merrill, Duane ^{[1
]}

Garland, Michael ^{[1
]}

机构：

[1] NVIDIA Corp, Santa Clara, CA 95050 USA

来源：

SC '16: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS | 2016年

关键词：

SpMV; sparse matrix; parallel merge; merge-path; many-core; GPU; linear algebra;

D O I：

暂无

中图分类号：

TP301 [理论、方法];

学科分类号：

081202 ;

摘要：

We present a strictly balanced method for the parallel computation of sparse matrix-vector products (SpMV). Our algorithm operates directly upon the Compressed Sparse Row (CSR) sparse matrix format without preprocessing, inspection, reformatting, or supplemental encoding. Regardless of nonzero structure, our equitable 2D merge-based decomposition tightly bounds the workload assigned to each processing element. Furthermore, our technique is suitable for recursively partitioning CSR datasets themselves into multiscale, distributed, NUMA, and GPU environments that are constrained by fixed-size local memories. We evaluate our method on both CPU and GPU microarchitectures across a very large corpus of diverse sparse matrix datasets. We show that traditional CsrMV methods are inconsistent performers, often subject to order-of-magnitude performance variation across similarly-sized datasets. In comparison, our method provides predictable performance that is substantially uncorrelated to the distribution of nonzeros among rows and broadly improves upon that of current CsrMV methods.

引用

页码：678 / 689

页数：12

共 33 条

[1]

[Anonymous], 2015, INT MATH KERN LIB MK

[2]

[Anonymous], THESIS

[3]

[Anonymous], SIAM INT C DAT MIN

[4]

[Anonymous], 2008, NVIDIA Technical Report NVR-2008-004

[5]

[Anonymous], 2003, CITESEERX

[6]

[Anonymous], 2013, NVIDIA CUSPARSE V7 5

[7]

[Anonymous], CUB V1 5 3 CUDA UNBO

[8] Fast Sparse Matrix-Vector Multiplication on GPUs for Graph Applications [J].

Ashari, Arash ;

Sedaghati, Naser ;

Eisenlohr, John ;

Parthasarathy, Srinivasan ;

Sadayappan, P. .

SC14: INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS, 2014, :781-792

[9]

Baxter S, 2013, EFFICIENT MERGE SEAR

[10]

Baxter Sean., 2013, Modern GPU

← 1 2 3 4 →