HybSMRP: a hybrid scheduling algorithm in Hadoop MapReduce framework

Cited by: 15
Authors
Gandomi, Abolfazl [1]
Reshadi, Midia [1]
Movaghar, Ali [2]
Khademzadeh, Ahmad [3]
Affiliations
[1] Islamic Azad Univ, Dept Comp Engn, Sci & Res Branch, Tehran, Iran
[2] Sharif Univ Technol, Dept Comp Engn, Tehran, Iran
[3] Iran Telecommun Res Ctr, ITRC, Tehran, Iran
Keywords
MapReduce; Scheduling; Hybrid algorithm; Data Locality; Dynamic priority; LOCALITY; PERFORMANCE
DOI
10.1186/s40537-019-0253-9
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
With the advent of new technologies, devices, and communication tools such as social networking sites, the amount of data produced by mankind is growing rapidly every year. Big data is a collection of large datasets that cannot be processed using traditional computing techniques. MapReduce was introduced to solve large-data computational problems. It is specifically designed to run on commodity hardware, and it relies on the divide-and-conquer principle. The focus of researchers has recently shifted towards Hadoop MapReduce. One of the most outstanding characteristics of MapReduce is data locality-aware scheduling. A data locality-aware scheduler is an efficient way to optimize one or more performance metrics, such as data locality, energy consumption, and job completion time. As in other settings, time and scheduling are the most important aspects of the MapReduce framework, so many scheduling algorithms have been proposed in the past decades. The main goals of these algorithms are to increase the data locality rate and to decrease response and completion times. This paper proposes a new hybrid scheduling algorithm that uses dynamic priority and localization ID techniques and focuses on increasing the data locality rate and decreasing completion time. The proposed algorithm was evaluated against the Hadoop default schedulers (FIFO and Fair) by running concurrent workloads consisting of the Wordcount and Terasort benchmarks. The experimental results show that the proposed algorithm is faster than FIFO and Fair scheduling, achieves a higher data locality rate, and avoids wasting resources.
Pages: 16