MapReduce scheduling algorithms in Hadoop: a systematic study

Cited by: 15
Authors
Hedayati, Soudabeh [1]
Maleki, Neda [2]
Olsson, Tobias [2]
Ahlgren, Fredrik [2]
Seyednezhad, Mahdi [3]
Berahmand, Kamal [4]
Affiliations
[1] Islamic Azad Univ, Dept Comp Engn, Sci & Res Branch, Tehran, Iran
[2] Linnaeus Univ, Appl IoT Lab, Dept Comp Sci & Media Technol, Kalmar, Sweden
[3] Florida Inst Technol, Sch Comp, Melbourne, FL USA
[4] Queensland Univ Technol QUT, Sci & Engn Fac, Sch Comp Sci, Brisbane, Qld, Australia
Source
JOURNAL OF CLOUD COMPUTING-ADVANCES SYSTEMS AND APPLICATIONS | 2023 / Vol. 12 / Issue 01
Keywords
Distributed systems; Resource allocation; Scheduling algorithms; Hadoop; MapReduce; Fair scheduling; BIG DATA; MODEL; JOBS
DOI
10.1186/s13677-023-00520-9
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Hadoop is a framework for storing and processing huge volumes of data on clusters. It uses the Hadoop Distributed File System (HDFS) to store data and MapReduce, a parallel computing framework, to process it. Scheduling is one of the most critical aspects of MapReduce because it can significantly affect the performance and efficiency of the overall system. The goal of scheduling is to improve performance, minimize response times, and utilize resources efficiently. This paper provides a systematic study of existing scheduling algorithms, a new classification of these schedulers, and a review of each category. It also examines the scheduling algorithms in terms of their main ideas, main objectives, advantages, and disadvantages.
Pages: 30