Scalable Analytics for IaaS Cloud Availability

Cited by: 87
Authors
Ghosh, Rahul [1]
Longo, Francesco [2]
Frattini, Flavio [3]
Russo, Stefano [3]
Trivedi, Kishor S. [4]
Affiliations
[1] IBM Corp, Durham, NC 27709 USA
[2] Univ Messina, Dipartimento Matemat, Contrada Dio S Agata, I-98164 Messina, Italy
[3] Univ Napoli Federico II, Dept Elect Engn & Informat Technol, I-80125 Naples, Italy
[4] Duke Univ, Dept Elect & Comp Engn, Durham, NC 27708 USA
Keywords
Analytic-numeric solution; availability; downtime; cloud computing; simulation; stochastic reward nets;
DOI
10.1109/TCC.2014.2310737
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In a large Infrastructure-as-a-Service (IaaS) cloud, component failures are quite common. Such failures may lead to occasional system downtime and eventual violation of Service Level Agreements (SLAs) on the cloud service availability. The availability analysis of the underlying infrastructure is useful to the service provider to design a system capable of providing a defined SLA, as well as to evaluate the capabilities of an existing one. This paper presents a scalable, stochastic model-driven approach to quantify the availability of a large-scale IaaS cloud, where failures are typically dealt with through migration of physical machines among three pools: hot (running), warm (turned on, but not ready), and cold (turned off). Since monolithic models do not scale for large systems, we use an interacting Markov chain based approach to demonstrate the reduction in the complexity of analysis and the solution time. The three pools are modeled by interacting sub-models. Dependencies among them are resolved using fixed-point iteration, for which existence of a solution is proved. The analytic-numeric solutions obtained from the proposed approach and from the monolithic model are compared. We show that the errors introduced by interacting sub-models are insignificant and that our approach can handle very large IaaS clouds. A simulative solution is also considered for the proposed model, and the solution times of the methods are compared.
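The fixed-point idea mentioned in the abstract can be illustrated with a small sketch. This is not the paper's stochastic reward net model: the three sub-models here are toy birth-death chains, and every rate, the coupling rules, and the names `pool_empty_prob` and `solve_fixed_point` are assumptions made only for illustration. Each sub-model is solved on its own, its output (the probability that the pool is empty) is fed into the others, and the loop repeats until the values stop changing.

```python
def steady_state(up_rates, down_rates):
    """Steady-state distribution of a birth-death chain on states 0..n.

    up_rates[k] is the rate k -> k+1; down_rates[k] is the rate k+1 -> k.
    """
    weights = [1.0]
    for up, down in zip(up_rates, down_rates):
        weights.append(weights[-1] * up / down)
    total = sum(weights)
    return [w / total for w in weights]


def pool_empty_prob(n, fail_rate, repair_rate):
    """Toy sub-model: P(no machine available) in a pool of n machines.

    State k is the number of available machines; machines fail and are
    repaired independently (an illustrative assumption).
    """
    up = [(n - k) * repair_rate for k in range(n)]    # k -> k+1
    down = [(k + 1) * fail_rate for k in range(n)]    # k+1 -> k
    return steady_state(up, down)[0]


def solve_fixed_point(n=3, lam=0.05, mu_slow=0.1, mu_fast=1.0, pull=0.5,
                      tol=1e-12, max_iter=1000):
    """Iterate the three toy sub-models until their outputs agree.

    All parameters and coupling rules are hypothetical, not from the paper.
    """
    hot = warm = cold = 0.0  # initial guesses for P(pool empty)
    for iteration in range(1, max_iter + 1):
        # Hot pool: repair is faster when a warm machine can be migrated in.
        new_hot = pool_empty_prob(
            n, lam, mu_slow + (mu_fast - mu_slow) * (1.0 - warm))
        # Warm pool: drained faster when hot is empty, refilled from cold.
        new_warm = pool_empty_prob(
            n, lam + pull * hot, mu_slow + (mu_fast - mu_slow) * (1.0 - cold))
        # Cold pool: drained faster when warm is empty.
        new_cold = pool_empty_prob(n, lam + pull * warm, mu_slow)
        delta = max(abs(new_hot - hot), abs(new_warm - warm),
                    abs(new_cold - cold))
        hot, warm, cold = new_hot, new_warm, new_cold
        if delta < tol:
            return (hot, warm, cold), iteration
    raise RuntimeError("fixed-point iteration did not converge")


if __name__ == "__main__":
    (hot, warm, cold), iterations = solve_fixed_point()
    print(f"P(empty): hot={hot:.3e} warm={warm:.3e} cold={cold:.3e} "
          f"after {iterations} iterations")
```

The point of the decomposition is that each sub-model has only n + 1 states, whereas a monolithic chain over all three pools would need on the order of (n + 1)^3 states. The price is the iteration loop and the approximation introduced by exchanging only summary quantities between sub-models.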
Pages: 57-70
Page count: 14