RETRACTED ARTICLE: Detecting straggler MapReduce tasks in big data processing infrastructure by neural network

Cited by: 0
Authors
Amir Javadpour
Guojun Wang
Samira Rezaei
Kuan-Ching Li
Affiliations
[1] Guangzhou University, School of Computer Science
[2] University of Groningen, Bernoulli Institute for Mathematics and Computer Science
[3] Providence University, Department of Computer Science and Information Engineering
Source
The Journal of Supercomputing | 2020 / Vol. 76
Keywords
Hadoop; Speculative execution; Straggler tasks; MapReduce; Artificial neural network
DOI
Not available
Abstract
Straggler task detection is one of the main challenges in applying MapReduce to parallelize and distribute large-scale data processing. It is defined as detecting tasks running on weak nodes. The Map phase has two stages (copy, combine) and the Reduce phase has three (shuffle, sort and reduce), so the total execution time is the sum of the execution times of these five stages. The primary purpose of this paper is to estimate the execution time of each stage correctly, which in turn yields a correct total execution time. The proposed method applies a backpropagation neural network on Hadoop to estimate the remaining execution time of tasks, which is essential for straggler task detection. The results are compared with popular algorithms in this domain, such as LATE and ESAMR, and with the real remaining time for the WordCount and Sort benchmarks; they show that the method detects straggler tasks and estimates execution time accurately. It also helps reduce task execution time.
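The five-stage decomposition described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the per-stage estimates are hard-coded here, whereas the paper obtains them from a backpropagation neural network, and the function names, the median-based straggler rule, and the `factor` threshold are all assumptions made for illustration.

```python
from statistics import median


def total_time(stage_times):
    """Total execution time as the sum of the per-stage times."""
    return sum(stage_times.values())


def remaining_time(estimated_stage_times, elapsed):
    """Estimated total minus time already spent, clamped at zero."""
    return max(0.0, total_time(estimated_stage_times) - elapsed)


def find_stragglers(remaining_by_task, factor=1.5):
    """Flag tasks whose estimated remaining time exceeds factor * median.

    The median rule and the default factor are illustrative assumptions,
    not the detection criterion used in the paper.
    """
    cutoff = factor * median(remaining_by_task.values())
    return sorted(t for t, r in remaining_by_task.items() if r > cutoff)


# Hypothetical per-stage estimates in seconds: two Map stages, three Reduce stages.
estimates = {"copy": 4.0, "combine": 2.0, "shuffle": 5.0, "sort": 3.0, "reduce": 6.0}
total = total_time(estimates)
left = remaining_time(estimates, elapsed=8.0)

# Hypothetical remaining-time estimates for three running tasks.
stragglers = find_stragglers({"task_a": 10.0, "task_b": 12.0, "task_c": 40.0})
```

In this sketch the quality of straggler detection depends entirely on the accuracy of the per-stage estimates, which is why the abstract treats remaining-time estimation as the central problem.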
Pages: 6969-6993
Page count: 24