HybSMRP: a hybrid scheduling algorithm in Hadoop MapReduce framework

Cited by: 15
Authors
Gandomi, Abolfazl [1]
Reshadi, Midia [1]
Movaghar, Ali [2]
Khademzadeh, Ahmad [3]
Affiliations
[1] Islamic Azad Univ, Dept Comp Engn, Sci & Res Branch, Tehran, Iran
[2] Sharif Univ Technol, Dept Comp Engn, Tehran, Iran
[3] Iran Telecommun Res Ctr, ITRC, Tehran, Iran
Keywords
MapReduce; Scheduling; Hybrid algorithm; Data locality; Dynamic priority; Locality; Performance
DOI
10.1186/s40537-019-0253-9
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
With the advent of new technologies, devices, and communication tools such as social networking sites, the amount of data produced by mankind grows rapidly every year. Big data is a collection of large datasets that cannot be processed using traditional computing techniques. MapReduce was introduced to solve large-data computational problems. It is specifically designed to run on commodity hardware, and it relies on the divide-and-conquer principle. Researchers' focus has since shifted towards Hadoop MapReduce. One of the most outstanding characteristics of MapReduce is data locality-aware scheduling. A data locality-aware scheduler is an efficient way to optimize one or more performance metrics, such as data locality, energy consumption, and job completion time. Time and scheduling are among the most important aspects of the MapReduce framework, so many scheduling algorithms have been proposed over the past decades. The main idea of these algorithms is to increase the data locality rate and decrease response and completion times. In this paper, a new hybrid scheduling algorithm is proposed that uses dynamic priority and localization ID techniques and focuses on increasing the data locality rate and decreasing completion time. The proposed algorithm was evaluated against the Hadoop default schedulers (FIFO and Fair) by running concurrent workloads consisting of the Wordcount and Terasort benchmarks. The experimental results show that the proposed algorithm is faster than FIFO and Fair scheduling, achieves a higher data locality rate, and avoids wasting resources.
Pages: 16