Scheduling Independent Moldable Tasks on Multi-Cores with GPUs

被引：22

作者：

Bleuse, Raphael ^{[1
]}

Hunold, Sascha ^{[3
]}

Kedad-Sidhoum, Safia ^{[2
]}

Monna, Florence ^{[2
]}

Mounie, Gregory ^{[1
]}

Trystram, Denis ^{[1
]}

机构：

[1] Univ Grenoble Alpes, CNRS, INRIA, Grenoble INP,LIG, F-38000 Grenoble, France

[2] UPMC Univ Paris 06, Sorbonne Univ, UMR 7606, LIP6, F-75005 Paris, France

[3] TU Wien, Fac Informat, Inst Informat Syst, Favoritenstr 16-184-53, A-1040 Vienna, Austria

来源：

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS | 2017年 / 28卷 / 09期

关键词：

Scheduling; heterogeneous computing; moldable tasks; dual approximation scheme; integer linear programming; APPROXIMATION ALGORITHMS; PARALLEL;

D O I：

10.1109/TPDS.2017.2675891

中图分类号：

TP301 [理论、方法];

学科分类号：

081202 ;

摘要：

We present a new approach for scheduling independent tasks on multiple CPUs and multiple GPUs. The tasks are assumed to be parallelizable on CPUs using the moldable model: the final number of cores allotted to a task can be decided and set by the scheduler. More precisely, we design an algorithm aiming at minimizing the makespan-the maximum completion time of all tasks-for this scheduling problem. The proposed algorithm combines a dual approximation scheme with a fast integer linear program (ILP). It determines both the partitioning of the tasks, i.e., whether a task should be mapped to CPUs or a GPU, and the number of CPUs allotted to a moldable task if mapped to the CPUs. A worst-case analysis shows that the algorithm has an approximation ratio of 3/2 + is an element of. Since the time complexity of the ILP-based algorithm could be non-polynomial, we also present a polynomial-time algorithm with an approximation ratio of 2 + is an element of. We complement the theoretical analysis of our two novel algorithms with a simulation study. In these simulations, we compare our algorithms to a modified version of the classical HEFT algorithm, which we adapted to handle moldable tasks. The simulation results show that our algorithm with the 3 2 + is an element of 3/2 + -approximation ratio produces significantly shorter schedules than the modified HEFT for most of the instances. In addition, our results provide evidence that our ILP-based algorithm can solve larger problem instances in a reasonable amount of time.

引用

页码：2689 / 2702

页数：14

共 32 条

[1]

Abe Y., 2012, P 2012 USENIX C POWE, V12, P10

[2]

Agullo E., 2011, Proceedings of the 25th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2011), P932, DOI 10.1109/IPDPS.2011.90

[3]

Amdahl G. M., 1967, P APR 18 20 1967 SPR, P483, DOI [10.1145/1465482.1465560, DOI 10.1145/1465482.1465560]

[4] StarPU: a unified platform for task scheduling on heterogeneous multicore architectures [J].

Augonnet, Cedric ;

Thibault, Samuel ;

Namyst, Raymond ;

Wacrenier, Pierre-Andre .

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2011, 23 (02) :187-198

[5]

Bezanson J, 2014, CORR

[6]

Blayo E, 1999, LECT NOTES COMPUT SC, V1685, P303

[7] Scheduling independent tasks on multi-cores with GPU accelerators [J].

Bleuse, Raphael ;

Kedad-Sidhoum, Safia ;

Monna, Florence ;

Mounie, Gregory ;

Trystram, Denis .

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2015, 27 (06) :1625-1638

[8]

Bonifaci V., 2012, CORR

[9] PaRSRC: Exploiting Heterogeneity to Enhance Scalability [J].

Bosilca, George ;

Bouteiller, Aurelien ;

Danalis, Anthony ;

Faverge, Mathieu ;

Herault, Thomas ;

Dongarra, Jack J. .

COMPUTING IN SCIENCE & ENGINEERING, 2013, 15 (06) :36-45

[10]

Bougeret M, 2010, LECT NOTES COMPUT SC, V6271, P157, DOI 10.1007/978-3-642-15277-1_16

← 1 2 3 4 →