MOMC: Multi-Objective and Multi-Constrained Scheduling Algorithm of Many Tasks in Hadoop

Cited by: 7
Authors
Voicu, Cristiana [1]
Pop, Florin [1]
Dobre, Ciprian [1]
Xhafa, Fatos [2]
Affiliations
[1] Univ Politehn Bucuresti, Bucharest, Romania
[2] Univ Politecnica Catalunya, Barcelona, Spain
Source
2014 NINTH INTERNATIONAL CONFERENCE ON P2P, PARALLEL, GRID, CLOUD AND INTERNET COMPUTING (3PGCIC) | 2014
Keywords
Task Scheduling; Hadoop; MapReduce; Big Data; Cloud Computing
DOI
10.1109/3PGCIC.2014.40
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Even though scheduling in distributed systems has been debated for many years, platforms and job types are changing every day. This is why we need special algorithms based on the requirements of new applications, especially when an application is deployed in a Cloud environment. One of the most important frameworks used for large-scale data processing in Clouds is Hadoop, together with its extensions. The Hadoop framework comes with default algorithms such as FIFO, Fair Scheduler, Capacity Scheduler, and Hadoop on Demand. Each of these scheduling algorithms focuses on a single, different constraint. It is hard to satisfy multiple constraints and pursue several objectives at the same time. After summarizing the most common schedulers and showing the need each one addressed when it appeared on the market, this paper presents MOMC, a multi-objective and multi-constrained scheduling algorithm for many tasks in Hadoop. The MOMC implementation focuses on two objectives, avoiding resource contention and achieving an optimal cluster workload, and on two constraints, deadline and budget. To compare the algorithms on different metrics, we use the Scheduling Load Simulator, which is integrated in the Hadoop framework and helps developers spend less time on testing. As a killer application that generates many tasks, we have chosen a processing task for the Million Song Dataset, a dataset containing metadata for one million commercially available songs.
Pages: 89-96
Page count: 8