Sparse Matrix-Vector Multiplication Optimizations based on Matrix Bandwidth Reduction using NVIDIA CUDA

Cited: 7
Authors
Xu, Shiming [1 ]
Lin, Hai Xiang [1 ]
Xue, Wei [2 ]
Affiliations
[1] Delft Univ Technol, Delft Inst Appl Math, Delft, Netherlands
[2] Tsinghua Univ, Dept Comp Sci & Technol, Beijing, Peoples R China
Source
PROCEEDINGS OF THE NINTH INTERNATIONAL SYMPOSIUM ON DISTRIBUTED COMPUTING AND APPLICATIONS TO BUSINESS, ENGINEERING AND SCIENCE (DCABES 2010) | 2010
Keywords
SpMV; GP-GPU; NVIDIA CUDA; RCM;
DOI
10.1109/DCABES.2010.162
Chinese Library Classification: TP39 [Computer Applications]
Discipline codes: 081203; 0835
Abstract
In this paper we propose optimizations of sparse matrix-vector multiplication (SpMV) with CUDA based on matrix bandwidth/profile reduction techniques. The time required to access the dense vector is decoupled from the rest of the SpMV computation. By reducing the matrix profile, the time required to access the dense vector is reduced by 17% (for SP) and 24% (for DP). A reduced matrix bandwidth enables compression of column-index information into shorter formats, yielding a 17% (for SP) and 10% (for DP) reduction in the execution time spent accessing matrix data under the ELLPACK format. The overall SpMV speedup is 16% and 12.6% for the whole matrix test suite. The optimizations proposed in this paper can be combined with other SpMV optimizations such as register blocking.
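To make the two ideas in the abstract concrete, the following is a minimal sketch (not code from the paper) of ELLPACK storage, the per-row SpMV that a CUDA thread would compute, and the matrix-bandwidth measure that determines whether column indices fit in a shorter integer type. The names `to_ellpack`, `spmv_ellpack`, and `bandwidth` are illustrative assumptions, not identifiers from the paper.

```python
def to_ellpack(rows):
    """Convert list-of-(col, val) rows to ELLPACK arrays, padded to the
    longest row; padding uses column 0 with value 0.0, so it is harmless."""
    width = max(len(r) for r in rows)
    cols = [[c for c, _ in r] + [0] * (width - len(r)) for r in rows]
    vals = [[v for _, v in r] + [0.0] * (width - len(r)) for r in rows]
    return cols, vals

def spmv_ellpack(cols, vals, x):
    """y = A*x with A in ELLPACK form; one output element per row,
    mirroring the one-thread-per-row CUDA mapping."""
    return [sum(v * x[c] for c, v in zip(crow, vrow))
            for crow, vrow in zip(cols, vals)]

def bandwidth(rows):
    """Matrix bandwidth: max |i - j| over stored entries.
    After a reordering such as RCM shrinks this value, the relative
    offset (j - i) of every column index is small, so indices can be
    stored in a shorter integer format (e.g. 16-bit), which is the
    column-index compression the abstract describes."""
    return max(abs(i - c) for i, r in enumerate(rows) for c, _ in r)
```

For example, a 3x3 tridiagonal matrix has bandwidth 1, so its relative column offsets always fit in a single byte, regardless of the matrix dimension.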
Pages: 609-614
Page count: 6
Related Papers
50 records in total
  • [41] An Effective Approach for Implementing Sparse Matrix-Vector Multiplication on Graphics Processing Units
    Abu-Sufah, Walid
    Karim, Asma Abdel
    2012 IEEE 14TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS & 2012 IEEE 9TH INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS (HPCC-ICESS), 2012, : 453 - 460
  • [42] TaiChi: A Hybrid Compression Format for Binary Sparse Matrix-Vector Multiplication on GPU
    Gao, Jianhua
    Ji, Weixing
    Tan, Zhaonian
    Wang, Yizhuo
    Shi, Feng
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (12) : 3732 - 3745
  • [43] SMAT: An Input Adaptive Auto-Tuner for Sparse Matrix-Vector Multiplication
    Li, Jiajia
    Tan, Guangming
    Chen, Mingyu
    Sun, Ninghui
    ACM SIGPLAN NOTICES, 2013, 48 (06) : 117 - 126
  • [44] Multi-Mode Transprecision Sparse Matrix-Vector Multiplication Engine for PageRank
    Kim, Whijin
    Lee, Jihye
    Kim, Sujin
    Kim, Ji-Hoon
    2022 INTERNATIONAL CONFERENCE ON ELECTRONICS, INFORMATION, AND COMMUNICATION (ICEIC), 2022,
  • [45] Multi-GPU Implementation and Performance Optimization for CSR-Based Sparse Matrix-Vector Multiplication
    Guo, Ping
    Zhang, Changjiang
    PROCEEDINGS OF 2017 3RD IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC), 2017, : 2419 - 2423
  • [46] SparseX: A Library for High-Performance Sparse Matrix-Vector Multiplication on Multicore Platforms
    Elafrou, Athena
    Karakasis, Vasileios
    Gkountouvas, Theodoros
    Kourtis, Kornilios
    Goumas, Georgios
    Koziris, Nectarios
    ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 2018, 44 (03):
  • [47] An Efficient Two-Dimensional Blocking Strategy for Sparse Matrix-Vector Multiplication on GPUs
    Ashari, Arash
    Sedaghati, Naser
    Eisenlohr, John
    Sadayappan, P.
    PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING, (ICS'14), 2014, : 273 - 282
  • [48] Sparse Matrix-Vector Product for the bmSparse Matrix Format in GPUs
    Berger, Gonzalo
    Dufrechou, Ernesto
    Ezzatti, Pablo
    EURO-PAR 2023: PARALLEL PROCESSING WORKSHOPS, PT I, EURO-PAR 2023, 2024, 14351 : 246 - 256
  • [49] Exploring Better Speculation and Data Locality in Sparse Matrix-Vector Multiplication on Intel Xeon
    Zhao, Haoran
    Xia, Tian
    Li, Chenyang
    Zhao, Wenzhe
    Zheng, Nanning
    Ren, Pengju
    2020 IEEE 38TH INTERNATIONAL CONFERENCE ON COMPUTER DESIGN (ICCD 2020), 2020, : 601 - 609
  • [50] A TASK-SCHEDULING APPROACH FOR EFFICIENT SPARSE SYMMETRIC MATRIX-VECTOR MULTIPLICATION ON A GPU
    Mironowicz, P.
    Dziekonski, A.
    Mrozowski, M.
    SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2015, 37 (06) : C643 - C666