MapReduce scheduling algorithms in Hadoop: a systematic study

Cited by: 0
Authors
Soudabeh Hedayati
Neda Maleki
Tobias Olsson
Fredrik Ahlgren
Mahdi Seyednezhad
Kamal Berahmand
Affiliations
[1] Islamic Azad University, Department of Computer Engineering, Science and Research Branch
[2] Linnaeus University, Applied IoT Lab, Department of Computer Science and Media Technology
[3] Florida Institute of Technology, School of Computing
[4] Queensland University of Technology (QUT), School of Computer Sciences, Science and Engineering Faculty
Source
Journal of Cloud Computing, Volume 12
Keywords
Distributed systems; Resource allocation; Scheduling algorithms; Hadoop; MapReduce; Fair scheduling
DOI
Not available
Abstract
Hadoop is a framework for storing and processing huge volumes of data on clusters. It stores data in the Hadoop Distributed File System (HDFS) and processes it with MapReduce, a parallel computing framework. Scheduling is one of the most critical aspects of MapReduce because it can significantly affect the performance and efficiency of the overall system. The goal of scheduling is to improve performance, minimize response times, and utilize resources efficiently. This paper provides a systematic study of existing scheduling algorithms. We also propose a new classification of these schedulers and review each category. In addition, we examine the scheduling algorithms in terms of their main ideas, main objectives, advantages, and disadvantages.