MapReduce scheduling algorithms in Hadoop: a systematic study

Cited by: 0
Authors
Soudabeh Hedayati
Neda Maleki
Tobias Olsson
Fredrik Ahlgren
Mahdi Seyednezhad
Kamal Berahmand
Affiliations
[1] Islamic Azad University, Department of Computer Engineering, Science and Research Branch
[2] Linnaeus University, Applied IoT Lab, Department of Computer Science and Media Technology
[3] Florida Institute of Technology, School of Computing
[4] Queensland University of Technology (QUT), School of Computer Sciences, Science and Engineering Faculty
Source
Journal of Cloud Computing, vol. 12
Keywords
Distributed systems; Resource allocation; Scheduling algorithms; Hadoop; MapReduce; Fair scheduling
DOI
Not available
Abstract
Hadoop is a framework for storing and processing huge volumes of data on clusters. It uses the Hadoop Distributed File System (HDFS) to store data and MapReduce, a parallel computing framework, to process it. Scheduling is one of the most critical aspects of MapReduce because it can significantly affect the performance and efficiency of the overall system. The goal of scheduling is to improve performance, minimize response times, and utilize resources efficiently. This paper provides a systematic study of existing scheduling algorithms, proposes a new classification of these schedulers, and reviews each category. It also examines the scheduling algorithms in terms of their main ideas, main objectives, advantages, and disadvantages.