Tails in the cloud: a survey and taxonomy of straggler management within large-scale cloud data centres

Cited by: 17
Authors
Gill, Sukhpal Singh [1]
Ouyang, Xue [2]
Garraghan, Peter [3]
Affiliations
[1] Queen Mary Univ London, Sch Elect Engn & Comp Sci, London, England
[2] Natl Univ Def Technol, Sch Elect Sci, Changsha, Peoples R China
[3] Univ Lancaster, Sch Comp & Commun, Lancaster, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
Computing; Stragglers; Cloud computing; Straggler management; Distributed systems; Cloud data centres
DOI
10.1007/s11227-020-03241-x
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Cloud computing systems split compute- and data-intensive jobs into smaller tasks and execute them in parallel across clusters to reduce execution time. However, as these systems grow in scale, they become increasingly exposed to stragglers: abnormally slow tasks within a job that substantially delay job completion. Such stragglers directly threaten the fast execution of data-intensive jobs in cloud computing. Researchers have proposed a variety of mechanisms, frameworks, and management techniques to detect and mitigate stragglers both proactively and reactively. In this paper, we present a comprehensive review of straggler management techniques within large-scale cloud data centres. We provide a detailed taxonomy of straggler causes, as well as of proposed management and mitigation techniques based on straggler characteristics and properties. From this systematic review, we outline several outstanding challenges and potential directions for future straggler research.
Pages: 10050-10089
Page count: 40
Related papers
50 items in total
  • [21] Low-power task scheduling algorithm for large-scale cloud data centers
    Xiaolong Xu
    Jiaxing Wu
    Geng Yang
    Ruchuan Wang
    Journal of Systems Engineering and Electronics, 2013, 24 (05) : 870 - 878
  • [22] Low-power task scheduling algorithm for large-scale cloud data centers
    Xu, Xiaolong
    Wu, Jiaxing
    Yang, Geng
    Wang, Ruchuan
    JOURNAL OF SYSTEMS ENGINEERING AND ELECTRONICS, 2013, 24 (05) : 870 - 878
  • [23] Silhouette: Efficient Cloud Configuration Exploration for Large-Scale Analytics
    Chen, Yanjiao
    Lin, Long
    Li, Baochun
    Wang, Qian
    Zhang, Qian
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2021, 32 (08) : 2049 - 2061
  • [24] A Large-Scale Distributed Sorting Algorithm Based on Cloud Computing
    Pang, Na
    Zhu, Dali
    Fan, Zheming
    Rong, Wenjing
    Feng, Weimiao
    APPLICATIONS AND TECHNIQUES IN INFORMATION SECURITY, ATIS 2015, 2015, 557 : 226 - 237
  • [25] A Cloud Computing Capability Model for Large-Scale Semantic Annotation
    Adedugbe, Oluwasegun
    Benkhelifa, Elhadj
    Bani-Hani, Anoud
    2020 13TH INTERNATIONAL CONFERENCE ON DEVELOPMENTS IN ESYSTEMS ENGINEERING (DESE 2020), 2020, : 335 - 340
  • [26] The Elastic Cloud Platform for the Large-Scale Domain Name System
    Li, Yunchun
    Lv, Cheng
    PRACTICAL APPLICATIONS OF INTELLIGENT SYSTEMS, ISKE 2013, 2014, 279 : 305 - 316
  • [27] DriftInsight: Detecting Anomalous Behaviors in Large-scale Cloud Platform
    Meng, Fan Jing
    Zhang, Xiao
    Chen, Pengfei
    Xu, Jingming
    2017 IEEE 10TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING (CLOUD), 2017, : 230 - 237
  • [28] Improving Failure Tolerance in Large-Scale Cloud Computing Systems
    Luo, Liang
    Meng, Sa
    Qiu, Xiwei
    Dai, Yuanshun
    IEEE TRANSACTIONS ON RELIABILITY, 2019, 68 (02) : 620 - 632
  • [29] Cloud Automation to Run Large-Scale Quantum Chemical Simulations
    AlRayhi, N.
    Salah, K.
    Al-Kork, N.
    Bentiba, A.
    Trabelsi, Z.
    Azad, M. A.
    PROCEEDINGS OF THE 2018 13TH INTERNATIONAL CONFERENCE ON INNOVATIONS IN INFORMATION TECHNOLOGY (IIT), 2018, : 75 - 80
  • [30] Muclouds: Parallel Simulator for Large-scale Cloud Computing Systems
    Liu, Jinzhao
    Zhou, Yuezhi
    Zhang, Di
    Fang, Yujian
    Han, Wei
    Zhang, Yaoxue
    2014 IEEE 11TH INTL CONF ON UBIQUITOUS INTELLIGENCE AND COMPUTING AND 2014 IEEE 11TH INTL CONF ON AUTONOMIC AND TRUSTED COMPUTING AND 2014 IEEE 14TH INTL CONF ON SCALABLE COMPUTING AND COMMUNICATIONS AND ITS ASSOCIATED WORKSHOPS, 2014, : 80 - 87