Heterogeneous Sparse Matrix Computations on Hybrid GPU/CPU Platforms

Cited by: 2
Authors
Cardellini, Valeria [1]
Fanfarillo, Alessandro [1]
Filippone, Salvatore [2]
Affiliations
[1] Univ Roma Tor Vergata, Dipartimento Ingn Civile & Ingn Informat, Rome, Italy
[2] Univ Roma Tor Vergata, Dipartimento Ingn Ind, Rome, Italy
Source
PARALLEL COMPUTING: ACCELERATING COMPUTATIONAL SCIENCE AND ENGINEERING (CSE) | 2014 / Vol. 25
Keywords
Sparse Matrix Computations; Design Patterns; CUDA; GPGPU;
DOI
10.3233/978-1-61499-381-0-203
Chinese Library Classification (CLC) number
TP3 [Computing Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Hybrid GPU/CPU clusters are becoming very popular in the scientific computing community, as attested by the number of such systems present in the Top 500 list. In this paper, we address one of the key algorithms for scientific applications: the computation of sparse matrix-vector products that lies at the heart of iterative solvers for sparse linear systems. We detail how design patterns for sparse matrix computations enable us to easily adapt to such a heterogeneous GPU/CPU platform using several sparse matrix formats in order to achieve best performance; then, we analyze static load balancing strategies for devising a suitable data decomposition and propose our approach. We discuss our experience in using different sparse matrix formats and data partitioning algorithms with a number of computational experiments executed on three different hybrid GPU/CPU platforms.
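As a rough illustration of the two ideas the abstract names (a sparse matrix-vector product and a static data decomposition), the sketch below stores a small matrix in CSR format and splits its rows between two devices so that each receives a chosen share of the nonzeros. This is not the authors' method, which this record does not describe. The function names `csr_spmv` and `split_rows_by_nnz` and the parameter `gpu_share` are hypothetical, and both "devices" are simulated on the CPU in plain Python.

```python
from bisect import bisect_left


def csr_spmv(row_ptr, col_idx, values, x, row_start=0, row_end=None):
    """Compute y = A @ x for rows [row_start, row_end) of a CSR matrix."""
    if row_end is None:
        row_end = len(row_ptr) - 1
    y = []
    for i in range(row_start, row_end):
        acc = 0.0
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i+1]-1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def split_rows_by_nnz(row_ptr, gpu_share):
    """Return a row index s so that rows [0, s) hold about gpu_share of the nonzeros.

    gpu_share is an assumed, user-supplied fraction in [0, 1]; in practice it
    might come from measured device throughput.
    """
    target = gpu_share * row_ptr[-1]
    # row_ptr is non-decreasing, so a binary search finds the first row
    # boundary whose cumulative nonzero count reaches the target.
    return bisect_left(row_ptr, target)


# 4x4 example matrix:
# [[4, 0, 0, 1],
#  [0, 3, 0, 0],
#  [0, 0, 2, 0],
#  [1, 0, 0, 5]]
row_ptr = [0, 2, 3, 4, 6]
col_idx = [0, 3, 1, 2, 0, 3]
values = [4, 1, 3, 2, 1, 5]
x = [1, 2, 3, 4]

split = split_rows_by_nnz(row_ptr, gpu_share=0.5)
y_first = csr_spmv(row_ptr, col_idx, values, x, 0, split)       # "GPU" rows
y_second = csr_spmv(row_ptr, col_idx, values, x, split, None)   # "CPU" rows
y = y_first + y_second
```

Balancing on nonzeros rather than on row count is one common choice, since the work in a sparse matrix-vector product is roughly proportional to the number of nonzeros processed; other criteria are possible.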
Pages: 203-212
Number of pages: 10
Related papers
50 in total
  • [21] The Scheduling Based on Machine Learning for Heterogeneous CPU/GPU Systems
    Shulga, D. A.
    Kapustin, A. A.
    Kozlov, A. A.
    Kozyrev, A. A.
    Rovnyagin, M. M.
    PROCEEDINGS OF THE 2016 IEEE NORTH WEST RUSSIA SECTION YOUNG RESEARCHERS IN ELECTRICAL AND ELECTRONIC ENGINEERING CONFERENCE (ELCONRUSNW), 2016, : 345 - 348
  • [22] Image Noise Removal on Heterogeneous CPU-GPU Configurations
    Sanchez, Maria G.
    Vidal, Vicente
    Arnal, Josep
    Vidal, Anna
    2014 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE, 2014, 29 : 2219 - 2229
  • [23] A hybrid CPU/GPU approach for optimizing sorting throughput
    Gowanlock, Michael
    Karsin, Ben
    PARALLEL COMPUTING, 2019, 85 : 45 - 55
  • [24] Using high performance algorithms for the hybrid simulation of disease dynamics on CPU and GPU
    Leonenko, Vasiliy N.
    Pertsev, Nikolai V.
    Artzrouni, Marc
    INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE, ICCS 2015 COMPUTATIONAL SCIENCE AT THE GATES OF NATURE, 2015, 51 : 150 - 159
  • [25] Approximate similarity search for online multimedia services on distributed CPU–GPU platforms
    George Teodoro
    Eduardo Valle
    Nathan Mariano
    Ricardo Torres
    Wagner Meira
    Joel H. Saltz
    The VLDB Journal, 2014, 23 : 427 - 448
  • [26] MPtostream: an OpenMP compiler for CPU-GPU heterogeneous parallel systems
    Yang XueJun
    Tang Tao
    Wang GuiBin
    Jia Jia
    Xu XinHai
    SCIENCE CHINA-INFORMATION SCIENCES, 2012, 55 (09) : 1961 - 1971
  • [27] MPtostream: an OpenMP compiler for CPU-GPU heterogeneous parallel systems
    XueJun Yang
    Tao Tang
    GuiBin Wang
    Jia Jia
    XinHai Xu
    Science China Information Sciences, 2012, 55 : 1961 - 1971
  • [29] Benchmarking of High Performance Computing Clusters with Heterogeneous CPU/GPU Architecture
    Sukharev, Pavel V.
    Vasilyev, Nikolay P.
    Rovnyagin, Mikhail M.
    Durnov, Maxim A.
    PROCEEDINGS OF THE 2017 IEEE RUSSIA SECTION YOUNG RESEARCHERS IN ELECTRICAL AND ELECTRONIC ENGINEERING CONFERENCE (2017 ELCONRUS), 2017, : 574 - 577
  • [30] An Efficient Storage Format for Storing Configuration Interaction Sparse Matrices on CPU/GPU
    Mahmoud, Mohammed
    Hoffmann, Mark
    Reza, Hassan
    PROCEEDINGS 2017 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE (CSCI), 2017, : 141 - 147