Efficient Sparse-Matrix Multi-Vector Product on GPUs

Cited by: 44

Authors:

Hong, Changwan [1]

Sukumaran-Rajam, Aravind [1]

Bandyopadhyay, Bortik [1]

Kim, Jinsung [1]

Kurt, Sureyya Emre [1]

Nisa, Israt [1]

Sabhlok, Shivani [1]

Catalyurek, Umit V. [2]

Parthasarathy, Srinivasan [1]

Sadayappan, P. [1]

Affiliations:

[1] Ohio State Univ, Columbus, OH 43210 USA

[2] Georgia Inst Technol, Atlanta, GA 30332 USA

Source:

HPDC '18: PROCEEDINGS OF THE 27TH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE PARALLEL AND DISTRIBUTED COMPUTING | 2018

Funding:

National Science Foundation (USA);

Keywords:

Sparse Matrix-Vector Multiplication; Sparse Matrix-Matrix Multiplication; Sparse Matrix Multi-Vector Multiplication; GPU; PERFORMANCE; SPMV; EIGENVALUES; ALGORITHM;

DOI:

10.1145/3208040.3208062

Chinese Library Classification (CLC) number:

TP3 [Computing Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Sparse Matrix-Vector (SpMV) and Sparse Matrix-Multivector (SpMM) products are key kernels for computational science and data science. While GPUs offer significantly higher peak performance and memory bandwidth than multicore CPUs, achieving high performance on sparse computations on GPUs is very challenging. A tremendous amount of recent research has focused on various GPU implementations of the SpMV kernel. But the multi-vector SpMM kernel has received much less attention. In this paper, we present an in-depth analysis to contrast SpMV and SpMM, and develop a new sparse-matrix representation and computation approach suited to achieving high data-movement efficiency and effective GPU parallelization of SpMM. Experimental evaluation using the entire SuiteSparse matrix suite demonstrates significant performance improvement over existing SpMM implementations from vendor libraries.
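To make the SpMV/SpMM distinction in the abstract concrete, here is a minimal sequential sketch in plain Python. It assumes the standard CSR (compressed sparse row) layout and is not the representation or GPU algorithm developed in the paper; the function names `spmm_csr` and `spmv_csr` are hypothetical. It only illustrates that in SpMM each nonzero of the sparse matrix is reused across all columns of the dense multi-vector, whereas SpMV is the single-column case.

```python
def spmm_csr(row_ptr, col_idx, vals, X):
    """Compute Y = A @ X for a CSR matrix A and a dense multi-vector X.

    row_ptr, col_idx, vals: standard CSR arrays describing A.
    X: list of rows, each a list of k values (one per vector).
    Illustrative reference only, not the paper's method.
    """
    n_rows = len(row_ptr) - 1
    k = len(X[0]) if X else 0
    Y = [[0.0] * k for _ in range(n_rows)]
    for i in range(n_rows):
        # Walk the nonzeros of row i.
        for p in range(row_ptr[i], row_ptr[i + 1]):
            a = vals[p]
            x_row = X[col_idx[p]]
            # Each nonzero is reused for all k columns of X.
            for c in range(k):
                Y[i][c] += a * x_row[c]
    return Y


def spmv_csr(row_ptr, col_idx, vals, x):
    """SpMV expressed as SpMM with a single column (k = 1)."""
    Y = spmm_csr(row_ptr, col_idx, vals, [[v] for v in x])
    return [row[0] for row in Y]


# A = [[1, 0, 2],
#      [0, 3, 0]] in CSR form
row_ptr = [0, 2, 3]
col_idx = [0, 2, 1]
vals = [1.0, 2.0, 3.0]

X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
Y = spmm_csr(row_ptr, col_idx, vals, X)
y = spmv_csr(row_ptr, col_idx, vals, [1.0, 3.0, 5.0])
```

In this sketch the inner loop over `c` is where SpMM gains data reuse relative to running SpMV once per vector: the value and column index of each nonzero are read once and applied to `k` operands.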


Pages: 66-79

Page count: 14

