MANY-TASK COMPUTING ON MANY-CORE ARCHITECTURES

Cited by: 0

Authors:

Valero-Lara, Pedro [1,2]

Nookala, Poornima [3]

Pelayo, Fernando L. [4]

Jansson, Johan [2,5]

Dimitropoulos, Serapheim [3]

Raicu, Ioan [3]

Affiliations:

[1] Univ Manchester, Manchester M13 9PL, Lancs, England

[2] BCAM, Bilbao, Spain

[3] IIT, Chicago, IL 60616 USA

[4] UCLM, Albacete, Spain

[5] KTH Royal Inst Technol, Stockholm, Sweden

Source:

SCALABLE COMPUTING-PRACTICE AND EXPERIENCE | 2016, Vol. 17, No. 01

Keywords:

Parallel Computing; Multi-Task Computing; Many-Core; GPU; Intel Xeon Phi; CUDA; OpenMP;

DOI:

Not available

Chinese Library Classification:

TP31 [Computer Software];

Discipline classification codes:

081202; 0835;

Abstract:

Many-Task Computing (MTC) is a common scenario on many parallel systems, such as clusters, grids, clouds and supercomputers, but it is less common on shared-memory parallel processors. Given the rapid growth in performance and in the number of cores integrated in many-core architectures, the study of MTC on such architectures is becoming increasingly relevant. In this paper, the authors present the programming mechanisms that take advantage of these massively parallel features for MTC. The hardware features of the two dominant many-core platforms (NVIDIA GPUs and Intel Xeon Phi) are also analyzed for our specific framework. Given the important hardware and software differences between the two platforms, we considered different strategies based on CUDA (for GPUs) and OpenMP (for Intel Xeon Phi). We carried out several test cases based on matrix multiplication, an appropriate and widely studied benchmark problem. Essentially, the study compared the time needed to compute several tasks in parallel one by one (all computational resources are used to compute one task at a time) with the time needed to compute the same set of tasks simultaneously (all computational resources are used to compute the whole set of tasks at the same time). Finally, we compared both software-hardware scenarios to identify the most relevant features of each many-core architecture.
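The two scheduling strategies compared in the abstract can be illustrated with a minimal sketch. It is not the authors' CUDA or OpenMP code: it uses a Python thread pool purely to show the structure of the comparison, and every name in it (`matmul_rows`, `run_one_by_one`, `run_simultaneously`, `compare`) is hypothetical. Because of Python's global interpreter lock, the measured times say nothing about the speedups a GPU or Xeon Phi would show; only the way work is split among workers mirrors the description.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def matmul_rows(a, b, rows):
    """Return the requested rows of the matrix product a * b (lists of lists)."""
    cols = list(zip(*b))  # columns of b
    return [[sum(x * y for x, y in zip(a[i], col)) for col in cols] for i in rows]


def run_one_by_one(tasks, workers):
    """Process tasks in sequence; all workers share the rows of the current task."""
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for a, b in tasks:
            n = len(a)
            # One strided chunk of row indices per worker.
            chunks = [range(w, n, workers) for w in range(workers)]
            parts = list(pool.map(lambda rows: matmul_rows(a, b, rows), chunks))
            # Put the computed rows back in their original order.
            c = [None] * n
            for rows, part in zip(chunks, parts):
                for i, row in zip(rows, part):
                    c[i] = row
            results.append(c)
    return results


def run_simultaneously(tasks, workers):
    """Process the whole set at once; each worker computes entire tasks."""
    def whole_task(task):
        a, b = task
        return matmul_rows(a, b, range(len(a)))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(whole_task, tasks))


def compare(tasks, workers):
    """Return (seconds one by one, seconds simultaneously) for the same tasks."""
    t0 = time.perf_counter()
    run_one_by_one(tasks, workers)
    t1 = time.perf_counter()
    run_simultaneously(tasks, workers)
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1
```

Calling `compare(tasks, workers)` with a list of `(a, b)` matrix pairs returns the two elapsed times. Both strategies produce identical products, so any difference between them is purely a matter of how the work is distributed.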


Pages: 33-46

Page count: 14
