Context-aware scheduling in MapReduce: a compact review

Cited by: 9
Authors
Idris, Muhammad [1]
Hussain, Shujaat [1]
Ali, Maqbool [1]
Abdulali, Arsen [1]
Siddiqi, Muhammad Hameed [1]
Kang, Byeong Ho [1]
Lee, Sungyoung [1]
Affiliations
[1] Kyung Hee Univ, Dept Comp Engn, Ubiquitous Comp Lab, Seoul, South Korea
Funding
National Research Foundation of Singapore;
Keywords
scheduling; task scheduling; job scheduling; data-intensive computing; big data; cloud; real-time
DOI
10.1002/cpe.3578
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
It is a fact that the attention of research community in computer science, business executives, and decision makers is drastically drawn by big data. As the volume of data becomes bigger, it needs performance-oriented data-intensive processing frameworks such as MapReduce, which can scale computation on large commodity clusters. Hadoop MapReduce processes data in Hadoop Distributed File System as jobs scheduled according to YARN fair scheduler and capacity scheduler. However, with advancement and dynamic changes in hardware and operating environments, the performance of clusters is greatly affected. Various efforts in literature have been made to address the issues of heterogeneity (i.e., clusters consisting of virtual machines and machines with different hardware), network communication, data locality, better resource utilization, and run-time scheduling. In this paper, we present a survey to discuss various research efforts made so far to improve Hadoop MapReduce scheduling. We classify scheduling algorithms and techniques proposed in the literature so far based on their addressing areas and present a taxonomy. Furthermore, we also discuss various aspects of open issues and challenges in the scheduling of MapReduce to improve its performance. Copyright © 2015 John Wiley & Sons, Ltd.
Pages: 5332-5349
Page count: 18