MapReduce scheduling algorithms in Hadoop: a systematic study

Cited by: 0

Authors:

Soudabeh Hedayati

Neda Maleki

Tobias Olsson

Fredrik Ahlgren

Mahdi Seyednezhad

Kamal Berahmand

Affiliations:

[1] Islamic Azad University, Department of Computer Engineering, Science and Research Branch

[2] Linnaeus University, Applied IoT Lab, Department of Computer Science and Media Technology

[3] Florida Institute of Technology, School of Computing

[4] Queensland University of Technology (QUT), School of Computer Sciences, Science and Engineering Faculty

Source:

Journal of Cloud Computing, Volume 12

Keywords:

Distributed systems; Resource allocation; Scheduling algorithms; Hadoop; MapReduce; Fair scheduling

DOI:

Not available

Abstract:

Hadoop is a framework for storing and processing huge volumes of data on clusters. It stores data in the Hadoop Distributed File System (HDFS) and processes that data with MapReduce, a parallel computing framework for large datasets. Scheduling is one of the most critical aspects of MapReduce because it can have a significant impact on the performance and efficiency of the overall system. The goal of scheduling is to improve performance, minimize response times, and utilize resources efficiently. This paper provides a systematic study of the existing scheduling algorithms, proposes a new classification of these schedulers, and reviews each category. It also examines the scheduling algorithms in terms of their main ideas, main objectives, advantages, and disadvantages.
