Auto-scaling of Web Applications in Clouds: A Tail Latency Evaluation

Cited by: 6
Authors
Aslanpour, Mohammad S. [1, 2]
Toosi, Adel N. [1]
Gaire, Raj [2]
Cheema, Muhammad Aamir [1]
Affiliations
[1] Monash Univ, Fac Informat Technol, Clayton, Vic, Australia
[2] CSIRO's Data61, Canberra, ACT, Australia
Source
2020 IEEE/ACM 13TH INTERNATIONAL CONFERENCE ON UTILITY AND CLOUD COMPUTING (UCC 2020) | 2020
Keywords
cloud computing; auto-scaling; tail latency; resource provisioning; performance evaluation
DOI
10.1109/UCC48980.2020.00037
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Auto-scaling mechanisms dynamically add and remove Virtual Machines (VMs) to reduce cost while minimizing latency. Latency improvements are mainly achieved by minimizing the "average" response time, yet unpredictable fluctuations in Web application workloads, also known as flash crowds, can cause very high latencies for users' requests. Requests affected by a flash crowd suffer long latencies and are known as outliers. Such outliers are largely inevitable as long as auto-scaling solutions keep improving the average rather than the "tail" of latencies. In this paper, we study possible sources of tail latency in auto-scaling mechanisms for Web applications. Based on extensive evaluations on a real cloud platform, we identified the following sources of tail latency: 1) large, i.e. data-intensive, requests; 2) long scaling intervals; 3) instant analysis of scaling parameters; 4) conservative, i.e. tight, threshold tuning; 5) load-unaware surplus VM selection policies used when executing a scale-down decision; 6) the cooldown feature, although it is cost-effective; and 7) VM start-up delay. We also found that after auto-scaling mechanisms improve the average latency, the tail may behave differently, which calls for dedicated tail-aware auto-scaling solutions.
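The distinction the abstract draws between "average" and "tail" latency can be illustrated with a minimal sketch. It is not taken from the paper: the sample values, the `percentile` helper, and the choice of the nearest-rank method are assumptions made purely for illustration.

```python
import math
import statistics


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical response times in milliseconds (illustrative only):
# 97 ordinary requests plus 3 requests delayed by a flash crowd.
latencies_ms = [100] * 97 + [5000] * 3

mean_ms = statistics.mean(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)

print(f"mean={mean_ms:.0f} ms, p50={p50_ms} ms, p99={p99_ms} ms")
```

In this made-up sample the mean stays within a few hundred milliseconds while the 99th percentile sits at the outlier value, which is the kind of gap that an average-oriented scaling rule would not see.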
Pages: 186-195
Page count: 10