Heterogeneous Sparse Matrix Computations on Hybrid GPU/CPU Platforms

Cited by: 2

Authors:

Cardellini, Valeria [1]

Fanfarillo, Alessandro [1]

Filippone, Salvatore [2]

Affiliations:

[1] Univ Roma Tor Vergata, Dipartimento Ingn Civile & Ingn Informat, Rome, Italy

[2] Univ Roma Tor Vergata, Dipartimento Ingn Ind, Rome, Italy

Source:

PARALLEL COMPUTING: ACCELERATING COMPUTATIONAL SCIENCE AND ENGINEERING (CSE) | 2014 / Vol. 25

Keywords:

Sparse Matrix Computations; Design Patterns; CUDA; GPGPU;

DOI:

10.3233/978-1-61499-381-0-203

Chinese Library Classification:

TP3 [Computing Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Hybrid GPU/CPU clusters are becoming very popular in the scientific computing community, as attested by the number of such systems present in the Top 500 list. In this paper, we address one of the key algorithms for scientific applications: the computation of sparse matrix-vector products that lies at the heart of iterative solvers for sparse linear systems. We detail how design patterns for sparse matrix computations enable us to easily adapt to such a heterogeneous GPU/CPU platform using several sparse matrix formats in order to achieve best performance; then, we analyze static load balancing strategies for devising a suitable data decomposition and propose our approach. We discuss our experience in using different sparse matrix formats and data partitioning algorithms with a number of computational experiments executed on three different hybrid GPU/CPU platforms.

Pages: 203-212

Page count: 10
