Tails in the cloud: a survey and taxonomy of straggler management within large-scale cloud data centres

Cited by: 0
Authors
Sukhpal Singh Gill
Xue Ouyang
Peter Garraghan
Affiliations
[1] Queen Mary University of London,School of Electronic Engineering and Computer Science
[2] National University of Defense Technology,School of Electronic Sciences
[3] Lancaster University,School of Computing and Communications
Source
The Journal of Supercomputing | 2020 / Vol. 76
Keywords
Computing; Stragglers; Cloud computing; Straggler management; Distributed systems; Cloud data centres
DOI
Not available
Abstract
Cloud computing systems split compute- and data-intensive jobs into smaller tasks and execute them in parallel on clusters to improve execution time. However, at increasing scale such systems are exposed to stragglers: abnormally slow tasks within a job that substantially delay job completion. Stragglers are a direct threat to the fast execution of data-intensive jobs in cloud computing. Researchers have proposed a range of mechanisms, frameworks, and management techniques to detect and mitigate stragglers both proactively and reactively. In this paper, we present a comprehensive review of straggler management techniques within large-scale cloud data centres. We provide a detailed taxonomy of straggler causes, as well as of proposed management and mitigation techniques based on straggler characteristics and properties. From this systematic review, we outline several outstanding challenges and potential directions for future straggler research.
Pages: 10050-10089
Page count: 39