A New Approach for Scheduling Tasks and/or Jobs in Big Data Cluster

Cited by: 3
Authors
Hadjar, Karim [1]
Jedidi, Ahmed [2]
Affiliations
[1] Ahlia Univ, Dept Multimedia Sci, Manama, Bahrain
[2] Ahlia Univ, Dept Comp Engn, Manama, Bahrain
Source
2019 4TH MEC INTERNATIONAL CONFERENCE ON BIG DATA AND SMART CITY (ICBDSC) | 2019
Keywords
Big Data; Hadoop; HDFS; MapReduce; Job Scheduling; Spark; DataNodes; Cluster; NameNode;
DOI
10.1109/icbdsc.2019.8645613
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
400 million tweets are sent each day, 4.75 billion pieces of multimedia content are shared every day on Facebook, and an estimated hundreds of hours of video are uploaded to YouTube every minute. Moreover, IoT devices (RFID and Wi-Fi wearable devices) generate a huge amount of data every second. About 90% of all data created so far was generated during the last two years. Processing this volume of data requires clusters of high-specification computers. Because computer prices keep dropping from one year to the next, almost all companies have started their own Big Data projects. Such projects offer companies a beneficial return on investment (ROI). Since the advent of Big Data, many improvements have been made to optimize the use of resources (especially RAM) and to reduce the time needed to run Big Data projects. Still, more work is needed on the scheduler so that it schedules tasks efficiently on the DataNodes of the Big Data cluster. In this paper, we propose a new approach for scheduling tasks and/or jobs in a Big Data cluster, which mainly focuses on optimizing the assignment of tasks to the DataNodes by the NameNode. The results obtained show that our task scheduler outperforms the traditional task schedulers, namely the FIFO Scheduler and the Capacity Scheduler, available in open source Big Data distributions such as Cloudera [19] and HortonWorks [20].
Pages: 191-194
Number of pages: 4
Related papers
20 in total
  • [1] [Anonymous], 2014, IT CONV SEC ICITCS 2
  • [2] [Anonymous], J COMPUTATIONAL INFO
  • [3] [Anonymous], 2018, Hadoop Fair Scheduler
  • [4] [Anonymous], 2018, 4 VS BIG DATA
  • [5] [Anonymous], 2017, CAPACITY SCHEDULER H
  • [6] [Anonymous], 2018, Hadoop Capacity Scheduler
  • [7] Big Data: A Survey
    Chen, Min
    Mao, Shiwen
    Liu, Yunhao
    [J]. MOBILE NETWORKS & APPLICATIONS, 2014, 19 (02) : 171 - 209
  • [8] Dean J, 2004, USENIX ASSOCIATION PROCEEDINGS OF THE SIXTH SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION (OSDI '04), P137
  • [9] He C, 2011, INT CONF ACOUST SPEE, P3540
  • [10] IDC, 2018, DIG UN OPP