TMaR: a two-stage MapReduce scheduler for heterogeneous environments

Cited by: 8

Authors:

Maleki, Neda [1]

Faragardi, Hamid Reza [2]

Rahmani, Amir Masoud [3]

Conti, Mauro [4]

Lofstead, Jay [5]

Affiliations:

[1] Islamic Azad Univ, Dept Comp Engn, Sci & Res Branch, Tehran, Iran

[2] KTH Royal Inst Technol, Dept Comp Sci & Commun, Stockholm, Sweden

[3] Khazar Univ, Dept Comp Sci, Baku, Azerbaijan

[4] Univ Padua, Dept Math, Padua, Italy

[5] Sandia Natl Labs, POB 5800, Albuquerque, NM 87185 USA

Source:

HUMAN-CENTRIC COMPUTING AND INFORMATION SCIENCES | 2020 / Vol. 10 / Issue 01

Keywords:

MapReduce; Hadoop; Heterogeneous systems; Scheduling; Performance; Shuffling; Power; Cloud computing; LOCALITY-AWARE; MAKESPAN; ALGORITHMS; SYSTEMS; TIME

DOI:

10.1186/s13673-020-00247-5

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

In MapReduce task scheduling, many algorithms focus mainly on scheduling Reduce tasks, assuming that Map tasks have already been scheduled. However, in cloud deployments of MapReduce, the input data resides on remote storage, which makes the scheduling of Map tasks important as well. In this paper, we propose a two-stage Map and Reduce task scheduler for heterogeneous environments, called TMaR. TMaR schedules Map and Reduce tasks on the servers that minimize the task finish time in each stage, respectively. We employ a dynamic partition binder for Reduce tasks in the Reduce stage to lighten the shuffling traffic. In this way, TMaR minimizes the makespan of a batch of tasks in heterogeneous environments while considering the network traffic. The simulation results demonstrate that TMaR outperforms Hadoop-stock and Hadoop-A in terms of makespan and network traffic, achieving average performance improvements of 29%, 36%, and 14% on the Wordcount, Sort, and Grep benchmarks, respectively. In addition, TMaR reduces power consumption by up to 12%.
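The abstract says that tasks are placed on the servers that minimize task finish time. As a rough illustration of that general idea only, the sketch below implements a generic greedy earliest-finish-time assignment on servers of different speeds. It is not the TMaR algorithm from the paper: the `Server` class, the `speed` and `ready_at` fields, and the function name are hypothetical, and the sketch ignores data location, shuffling traffic, and the partition binder.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical model of a heterogeneous server (not from the paper)."""
    name: str
    speed: float           # work units processed per second (assumed)
    ready_at: float = 0.0  # time at which the server becomes free


def schedule_min_finish_time(task_sizes, servers):
    """Greedily assign each task to the server where it would finish earliest.

    Returns the list of chosen server names (one per task, in order)
    and the resulting makespan.
    """
    assignment = []
    for size in task_sizes:
        # Estimated finish time = time the server is free + run time on it.
        best = min(servers, key=lambda s: s.ready_at + size / s.speed)
        best.ready_at += size / best.speed
        assignment.append(best.name)
    makespan = max(s.ready_at for s in servers)
    return assignment, makespan


servers = [Server("fast", speed=2.0), Server("slow", speed=1.0)]
assignment, makespan = schedule_min_finish_time([4, 2, 4], servers)
print(assignment, makespan)
```

In this toy run the second task goes to the slower server because the faster one is still busy, which is the kind of trade-off a finish-time-based scheduler makes in a heterogeneous cluster.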


Pages: 26
