Sparse Matrix-Vector Multiplication Optimizations based on Matrix Bandwidth Reduction using NVIDIA CUDA

Cited: 7
Authors
Xu, Shiming [1 ]
Lin, Hai Xiang [1 ]
Xue, Wei [2 ]
Affiliations
[1] Delft Univ Technol, Delft Inst Appl Math, Delft, Netherlands
[2] Tsinghua Univ, Dept Comp Sci & Technol, Beijing, Peoples R China
Source
PROCEEDINGS OF THE NINTH INTERNATIONAL SYMPOSIUM ON DISTRIBUTED COMPUTING AND APPLICATIONS TO BUSINESS, ENGINEERING AND SCIENCE (DCABES 2010) | 2010
Keywords
SpMV; GP-GPU; NVIDIA CUDA; RCM;
DOI
10.1109/DCABES.2010.162
Chinese Library Classification: TP39 [Computer Applications]
Discipline codes: 081203; 0835
Abstract
In this paper we propose optimizations of sparse matrix-vector multiplication (SpMV) with CUDA based on matrix bandwidth/profile reduction techniques. The time required to access the dense vector is decoupled from the rest of the SpMV computation. By reducing the matrix profile, the time required to access the dense vector is reduced by 17% (for SP) and 24% (for DP). A reduced matrix bandwidth enables compression of column-index information into shorter formats, yielding a 17% (for SP) and 10% (for DP) reduction in the execution time spent accessing matrix data under the ELLPACK format. The overall SpMV speedup is 16% and 12.6% for the whole matrix test suite. The optimizations proposed in this paper can be combined with other SpMV optimizations such as register blocking.
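To make the two ideas in the abstract concrete, the following is a minimal sketch (not code from the paper) of ELLPACK storage, the per-row SpMV that a CUDA thread would compute, and the matrix-bandwidth measure that determines whether column indices fit in a shorter integer type. The names `to_ellpack`, `spmv_ellpack`, and `bandwidth` are illustrative assumptions, not identifiers from the paper.

```python
def to_ellpack(rows):
    """Convert list-of-(col, val) rows to ELLPACK arrays, padded to the
    longest row; padding uses column 0 with value 0.0, so it is harmless."""
    width = max(len(r) for r in rows)
    cols = [[c for c, _ in r] + [0] * (width - len(r)) for r in rows]
    vals = [[v for _, v in r] + [0.0] * (width - len(r)) for r in rows]
    return cols, vals

def spmv_ellpack(cols, vals, x):
    """y = A*x with A in ELLPACK form; one output element per row,
    mirroring the one-thread-per-row CUDA mapping."""
    return [sum(v * x[c] for c, v in zip(crow, vrow))
            for crow, vrow in zip(cols, vals)]

def bandwidth(rows):
    """Matrix bandwidth: max |i - j| over stored entries.
    After a reordering such as RCM shrinks this value, the relative
    offset (j - i) of every column index is small, so indices can be
    stored in a shorter integer format (e.g. 16-bit), which is the
    column-index compression the abstract describes."""
    return max(abs(i - c) for i, r in enumerate(rows) for c, _ in r)
```

For example, a 3x3 tridiagonal matrix has bandwidth 1, so its relative column offsets always fit in a single byte, regardless of the matrix dimension.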
Pages: 609-614
Page count: 6
Related Papers
50 records in total
  • [41] An Effective Approach for Implementing Sparse Matrix-Vector Multiplication on Graphics Processing Units
    Abu-Sufah, Walid
    Karim, Asma Abdel
    2012 IEEE 14TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS & 2012 IEEE 9TH INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS (HPCC-ICESS), 2012, : 453 - 460
  • [42] TaiChi: A Hybrid Compression Format for Binary Sparse Matrix-Vector Multiplication on GPU
    Gao, Jianhua
    Ji, Weixing
    Tan, Zhaonian
    Wang, Yizhuo
    Shi, Feng
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (12) : 3732 - 3745
  • [43] SMAT: An Input Adaptive Auto-Tuner for Sparse Matrix-Vector Multiplication
    Li, Jiajia
    Tan, Guangming
    Chen, Mingyu
    Sun, Ninghui
    ACM SIGPLAN NOTICES, 2013, 48 (06) : 117 - 126
  • [44] Multi-Mode Transprecision Sparse Matrix-Vector Multiplication Engine for PageRank
    Kim, Whijin
    Lee, Jihye
    Kim, Sujin
    Kim, Ji-Hoon
    2022 INTERNATIONAL CONFERENCE ON ELECTRONICS, INFORMATION, AND COMMUNICATION (ICEIC), 2022,
  • [45] Multi-GPU Implementation and Performance Optimization for CSR-Based Sparse Matrix-Vector Multiplication
    Guo, Ping
    Zhang, Changjiang
    PROCEEDINGS OF 2017 3RD IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC), 2017, : 2419 - 2423
  • [46] SparseX: A Library for High-Performance Sparse Matrix-Vector Multiplication on Multicore Platforms
    Elafrou, Athena
    Karakasis, Vasileios
    Gkountouvas, Theodoros
    Kourtis, Kornilios
    Goumas, Georgios
    Koziris, Nectarios
    ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 2018, 44 (03):
  • [47] An Efficient Two-Dimensional Blocking Strategy for Sparse Matrix-Vector Multiplication on GPUs
    Ashari, Arash
    Sedaghati, Naser
    Eisenlohr, John
    Sadayappan, P.
    PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING, (ICS'14), 2014, : 273 - 282
  • [48] Sparse Matrix-Vector Product for the bmSparse Matrix Format in GPUs
    Berger, Gonzalo
    Dufrechou, Ernesto
    Ezzatti, Pablo
    EURO-PAR 2023: PARALLEL PROCESSING WORKSHOPS, PT I, EURO-PAR 2023, 2024, 14351 : 246 - 256
  • [49] Exploring Better Speculation and Data Locality in Sparse Matrix-Vector Multiplication on Intel Xeon
    Zhao, Haoran
    Xia, Tian
    Li, Chenyang
    Zhao, Wenzhe
    Zheng, Nanning
    Ren, Pengju
    2020 IEEE 38TH INTERNATIONAL CONFERENCE ON COMPUTER DESIGN (ICCD 2020), 2020, : 601 - 609
  • [50] A TASK-SCHEDULING APPROACH FOR EFFICIENT SPARSE SYMMETRIC MATRIX-VECTOR MULTIPLICATION ON A GPU
    Mironowicz, P.
    Dziekonski, A.
    Mrozowski, M.
    SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2015, 37 (06) : C643 - C666