MapReduce scheduling algorithms: a review

Cited by: 25
Authors
Hashem, Ibrahim Abaker Targio [1, 2]
Anuar, Nor Badrul [2 ]
Marjani, Mohsen [1 ]
Ahmed, Ejaz [3 ]
Chiroma, Haruna [4 ]
Firdaus, Ahmad [5 ]
Abdullah, Muhamad Taufik [6 ]
Alotaibi, Faiz [6 ]
Ali, Waleed Kamaleldin Mahmoud [2 ]
Yaqoob, Ibrar [2 ]
Gani, Abdullah [1 ]
Affiliations
[1] Taylors Univ, Sch Comp & Informat Technol, Subang Jaya 47500, Malaysia
[2] Univ Malaya, Fac Comp Sci & Informat Technol, Kuala Lumpur, Malaysia
[3] Univ Malaya, Ctr Mobile Cloud Comp Res, Kuala Lumpur, Malaysia
[4] Fed Coll Educ Tech, Dept Comp Sci, Gombe, Nigeria
[5] Univ Malaysia Pahang, Fac Comp Syst & Software Engn, Kuantan 26300, Pahang, Malaysia
[6] Univ Putra Malaysia, Fac Comp Sci & Informat Technol, Serdang 43400, Selangor, Malaysia
Keywords
Big data; Hadoop; MapReduce; Cloud computing; Scheduling algorithms; BIG DATA; RESOURCE-ALLOCATION; PERFORMANCE;
DOI
10.1007/s11227-018-2719-5
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Recent trends in big data show that the amount of data continues to increase at an exponential rate. Over the past few years, this trend has inspired many researchers to explore new research directions across multiple areas of big data. The widespread popularity of big data processing platforms built on the MapReduce framework has created a growing demand to further optimize their performance for various purposes. In particular, enhancing resource and job scheduling is becoming critical, since scheduling fundamentally determines whether applications can achieve their performance goals in different use cases. Scheduling plays an important role in big data, mainly in reducing the execution time and cost of processing. This paper surveys the research undertaken in the field of scheduling in big data platforms. It analyzes scheduling in MapReduce from two aspects: taxonomy and performance evaluation. The research progress in MapReduce scheduling algorithms is also discussed. The limitations of existing MapReduce scheduling algorithms and future research opportunities are pointed out for easy identification by researchers. The study can serve as a benchmark for expert researchers proposing novel MapReduce scheduling algorithms, and as a starting point for novice researchers.
Pages: 4915-4945
Number of pages: 31