State Monitoring in Cloud Datacenters

Cited by: 23
Authors
Meng, Shicong [1]
Liu, Ling [2]
Wang, Ting [3]
Affiliations
[1] Georgia Inst Technol, Coll Comp, GT Stn 37975, Atlanta, GA 30332 USA
[2] Georgia Inst Technol, Coll Comp, KACB, Atlanta, GA 30332 USA
[3] Georgia Inst Technol, Coll Comp, Georgia Tech Stn 329544, Atlanta, GA 30332 USA
Funding
National Science Foundation (USA)
Keywords
State monitoring; datacenter; cloud; distributed; aggregation; tuning
DOI
10.1109/TKDE.2011.70
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Monitoring the global state of a distributed cloud application is a critical function in cloud datacenter management. State monitoring must meet two demanding objectives: a high level of correctness, which means a zero or low error rate, and high communication efficiency, which means minimal communication cost in detecting state updates. Most existing work follows an instantaneous model, which triggers a state alert whenever a constraint is violated. This model may cause frequent and unnecessary alerts due to momentary value bursts and outliers, and countermeasures taken in response to such alerts may in turn cause problematic operations. In this paper, we present a WIndow-based StatE monitoring (WISE) framework for efficiently managing cloud applications. Window-based state monitoring reports an alert only when a state violation is continuous within a time window. We show that it is not only more resilient to value bursts and outliers, but also able to save considerable communication when implemented in a distributed manner, based on four technical contributions. First, we present the architectural design and deployment options for window-based state monitoring with centralized parameter tuning. Second, we develop a new distributed parameter tuning scheme that enables WISE to scale to many more monitoring nodes, since each node tunes its monitoring parameters reactively without global information. Third, we introduce two optimization techniques, including their design rationale, correctness, and usage model, to further reduce the communication cost. Finally, we provide an in-depth empirical study of the scalability of WISE and evaluate the improvement brought by the distributed tuning scheme and the two performance optimizations. Our results show that WISE reduces communication by 50-90 percent compared with instantaneous monitoring approaches, and that the improved WISE gains a clear scalability advantage over its centralized version.
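The difference between the instantaneous model and the window-based model described in the abstract can be illustrated with a small sketch. This is not the WISE framework itself: it ignores distribution, parameter tuning, and communication cost, and looks only at the alert semantics on a single value stream. The function names, the sample trace, the threshold of 90, and the window length of 3 are all assumptions made for illustration.

```python
from typing import List


def instantaneous_alerts(values: List[float], threshold: float) -> List[int]:
    """Return every time index at which the value exceeds the threshold."""
    return [t for t, v in enumerate(values) if v > threshold]


def window_alerts(values: List[float], threshold: float, window: int) -> List[int]:
    """Return time indices at which the last `window` values all exceed the threshold."""
    alerts = []
    run = 0  # length of the current run of consecutive violations
    for t, v in enumerate(values):
        run = run + 1 if v > threshold else 0
        if run >= window:
            alerts.append(t)
    return alerts


# Hypothetical trace: one momentary burst at index 1, then a sustained violation.
trace = [10, 95, 12, 11, 91, 93, 96, 97, 15]

print(instantaneous_alerts(trace, threshold=90))     # [1, 4, 5, 6, 7]
print(window_alerts(trace, threshold=90, window=3))  # [6, 7]
```

In this trace the instantaneous model fires on the isolated burst at index 1, while the window-based model stays silent until the violation has persisted for three consecutive samples.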
Pages: 1328-1344
Number of pages: 17