Improving MapReduce Performance with Progress and Feedback based Speculative Execution

Cited by: 7

Authors:

Ibrahim, Ibrahim Adel [1]

Bassiouni, Mostafa [2]

Affiliations:

[1] Univ Cent Florida, Dept Elect & Comp Engn, Orlando, FL 32816 USA

[2] Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA

Source:

2017 IEEE INTERNATIONAL CONFERENCE ON SMART CLOUD (SMARTCLOUD) | 2017

Keywords:

Cloud Computing; MapReduce; Data-Intensive Computing; Parallel and Distributed Processing; Straggler Identification; Speculative Execution

DOI:

10.1109/SmartCloud.2017.25

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods]

Discipline classification code:

081202

Abstract:

Task stragglers dramatically impede parallel job execution in data-intensive computing in cloud datacenters. The uneven distribution of input data, which results from heterogeneous data nodes, resource contention, and network configurations, causes delay failures by violating job completion time requirements. Data-intensive computing frameworks such as MapReduce or Hadoop employ a mechanism called speculative execution to deal with stragglers, but speculative execution provides limited effectiveness because in many cases straggler identification occurs too late in the job lifecycle. Both identifying a straggler and the timing of that identification are very important for straggler mitigation in data-intensive cloud computing. Speculative execution is widely adopted as a straggler identification and mitigation scheme, but it has certain inherent limitations. In this paper, we strive to make Hadoop more efficient in cloud environments. We present the Progress and Feedback based Speculative Execution algorithm (PFSE), a new straggler identification scheme that identifies straggler MapReduce tasks based on feedback information received from completed tasks in addition to the progress of the currently running task. Our extensive simulations show that PFSE can outperform dynamic scheduling techniques such as the Self-Learning MapReduce scheduler (SLM) and LATE. PFSE can assist in enhancing straggler identification and mitigation for tolerating late-timing failures in data-intensive cloud computing.

Pages: 120-125

Page count: 6

20 references in total

[1] [Anonymous], 2013, 10 USENIX S NETWORKE

[2] [Anonymous], 2010, P 9 USENIX C OP SYST

[3] [Anonymous], 2008, 8 USENIX S OP SYST D

[4] Improving MapReduce Performance Using Smart Speculative Execution Strategy [J]. Chen, Qi; Liu, Cheng; Xiao, Zhen. IEEE TRANSACTIONS ON COMPUTERS, 2014, 63 (04): 954-967

[5] Improving Load Balance for Data-Intensive Computing on Cloud Platforms [J]. Dai, Wei; Ibrahim, Ibrahim; Bassiouni, Mostafa. 2016 IEEE INTERNATIONAL CONFERENCE ON SMART CLOUD (SMARTCLOUD), 2016: 140-145

[6] A New Replica Placement Policy for Hadoop Distributed File System [J]. Dai, Wei; Ibrahim, Ibrahim; Bassiouni, Mostafa. 2016 IEEE 2ND INTERNATIONAL CONFERENCE ON BIG DATA SECURITY ON CLOUD (BIGDATASECURITY), IEEE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE AND SMART COMPUTING (HPSC), AND IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT DATA AND SECURITY (IDS), 2016: 262-267

[7] An improved task assignment scheme for Hadoop running in the clouds [J].