State Monitoring in Cloud Datacenters

Cited by: 23
Authors
Meng, Shicong [1]
Liu, Ling [2]
Wang, Ting [3]
Affiliations
[1] Georgia Inst Technol, Coll Comp, GT Stn 37975, Atlanta, GA 30332 USA
[2] Georgia Inst Technol, Coll Comp, KACB, Atlanta, GA 30332 USA
[3] Georgia Inst Technol, Coll Comp, Georgia Tech Stn 329544, Atlanta, GA 30332 USA
Funding
National Science Foundation (USA)
Keywords
State monitoring; datacenter; cloud; distributed; aggregation; tuning
DOI
10.1109/TKDE.2011.70
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Monitoring global states of a distributed cloud application is a critical functionality for cloud datacenter management. State monitoring requires meeting two demanding objectives: a high level of correctness, which ensures a zero or low error rate, and high communication efficiency, which demands minimal communication cost in detecting state updates. Most existing work follows an instantaneous model, which triggers a state alert whenever a constraint is violated. This model may cause frequent and unnecessary alerts due to momentary value bursts and outliers. Countermeasures taken in response to such alerts may in turn cause problematic operations. In this paper, we present a WIndow-based StatE monitoring (WISE) framework for efficiently managing cloud applications. Window-based state monitoring reports an alert only when a state violation is continuous within a time window. We show that it is not only more resilient to value bursts and outliers, but also able to save considerable communication when implemented in a distributed manner, based on four technical contributions. First, we present the architectural design and deployment options for window-based state monitoring with centralized parameter tuning. Second, we develop a new distributed parameter tuning scheme that enables WISE to scale to many more monitoring nodes, as each node tunes its monitoring parameters reactively without global information. Third, we introduce two optimization techniques, including their design rationale, correctness, and usage model, to further reduce the communication cost. Finally, we provide an in-depth empirical study of the scalability of WISE, and evaluate the improvement brought by the distributed tuning scheme and the two performance optimizations. Our results show that WISE reduces communication by 50-90 percent compared with instantaneous monitoring approaches, and that the improved WISE gains a clear scalability advantage over its centralized version.
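The difference between the instantaneous model and the window-based model described in the abstract can be illustrated with a minimal single-stream sketch. This is not the WISE implementation: the function names, the simple "value exceeds threshold" constraint, and the sample data are assumptions made for illustration, and the sketch does not model the distributed aggregation or the parameter tuning that the paper contributes.

```python
from typing import Iterable, List


def instantaneous_alerts(values: Iterable[float], threshold: float) -> List[int]:
    """Hypothetical helper: alert at every sample that violates the threshold."""
    return [i for i, v in enumerate(values) if v > threshold]


def window_alerts(values: Iterable[float], threshold: float, window: int) -> List[int]:
    """Hypothetical helper: alert only when the most recent `window`
    consecutive samples all violate the threshold."""
    alerts = []
    run = 0  # length of the current run of consecutive violations
    for i, v in enumerate(values):
        run = run + 1 if v > threshold else 0
        if run >= window:
            alerts.append(i)
    return alerts


# Assumed sample data: one momentary burst (index 1), then a sustained violation.
samples = [10, 95, 12, 11, 90, 92, 94, 96, 15]
print(instantaneous_alerts(samples, 80))  # [1, 4, 5, 6, 7]
print(window_alerts(samples, 80, 3))      # [6, 7]
```

In this toy run the momentary burst at index 1 triggers an instantaneous alert but no window-based alert, which is the resilience to bursts and outliers that the abstract describes.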
Pages: 1328-1344
Page count: 17