Data Partitioning on Multicore and Multi-GPU Platforms Using Functional Performance Models

Cited by: 52

Authors:

Zhong, Ziming [1]

Rychkov, Vladimir [1]

Lastovetsky, Alexey [1]

Affiliations:

[1] Univ Coll Dublin, Sch Comp Sci & Informat, Dublin 4, Ireland

Source:

IEEE TRANSACTIONS ON COMPUTERS | 2015 / Vol. 64 / No. 09

Funding:

Science Foundation Ireland;

Keywords:

HPC; heterogeneous computing; GPU-accelerated multicore system; performance modeling; data partitioning; HETEROGENEOUS MULTICORE; EQUATIONS; SYSTEMS;

DOI:

10.1109/TC.2014.2375202

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Heterogeneous multiprocessor systems, which are composed of a mix of processing elements such as commodity multicore processors, graphics processing units (GPUs), and others, have been widely used in the scientific computing community. Software applications incorporate code designed and optimized for different types of processing elements in order to exploit the computing power of such heterogeneous systems. In this paper, we consider the problem of optimally distributing the workload of data-parallel scientific applications between the processing elements of such systems. We present a solution that uses functional performance models (FPMs) of processing elements and FPM-based data partitioning algorithms. The efficiency of this approach is demonstrated by experiments with parallel matrix multiplication and numerical simulation of lid-driven cavity flow on hybrid servers and clusters.
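
The abstract does not spell out the partitioning algorithm, so the following is only a minimal Python sketch of the general idea, not the authors' implementation. It assumes that a functional performance model gives each device's speed as a function of the problem size assigned to it, and that a good partition is one in which the predicted execution times are balanced. The names `max_units_within`, `partition`, `cpu_speed`, and `gpu_speed`, and the speed values themselves, are hypothetical and chosen purely for illustration.

```python
from typing import Callable, List, Sequence

# Hypothetical model type: speed (work units per second) at problem size x.
SpeedFn = Callable[[float], float]


def max_units_within(speed: SpeedFn, t_limit: float, n: int) -> int:
    """Largest integer x in [0, n] with predicted time x / speed(x) <= t_limit.

    Assumes the predicted time x / speed(x) is non-decreasing in x.
    """
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid / speed(mid) <= t_limit:
            lo = mid
        else:
            hi = mid - 1
    return lo


def partition(n: int, speeds: Sequence[SpeedFn], iters: int = 60) -> List[int]:
    """Split n work units so that predicted execution times are balanced."""
    # Feasible upper bound on the time: give all work to the best single device.
    t_lo, t_hi = 0.0, min(n / s(n) for s in speeds)
    # Bisect on the common time limit until the devices can just absorb n units.
    for _ in range(iters):
        t_mid = 0.5 * (t_lo + t_hi)
        if sum(max_units_within(s, t_mid, n) for s in speeds) >= n:
            t_hi = t_mid
        else:
            t_lo = t_mid
    parts = [max_units_within(s, t_hi, n) for s in speeds]
    # Trim any surplus, one unit at a time, from the slowest-finishing device.
    while sum(parts) > n:
        i = max(
            range(len(parts)),
            key=lambda k: parts[k] / speeds[k](parts[k]) if parts[k] else 0.0,
        )
        parts[i] -= 1
    return parts


def cpu_speed(x: float) -> float:
    """Assumed model: constant speed regardless of problem size."""
    return 100.0


def gpu_speed(x: float) -> float:
    """Assumed model: fast while the data fits in device memory, slower beyond."""
    return 400.0 if x <= 600 else 150.0


if __name__ == "__main__":
    print(partition(1000, [cpu_speed, gpu_speed]))
```

With these assumed models, a split based on a single constant speed per device would overload the GPU past the point where its speed drops, whereas the size-dependent model keeps the GPU share at the 600-unit threshold. Real models would be built from measurements rather than closed-form functions.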

Pages: 2506-2518

Number of pages: 13
