Cloud Service Reliability Enhancement via Virtual Machine Placement Optimization

Cited by: 122
Authors
Zhou, Ao [1]
Wang, Shangguang [1]
Cheng, Bo [1]
Zheng, Zibin [2]
Yang, Fangchun [1]
Chang, Rong N. [3]
Lyu, Michael R. [4]
Buyya, Rajkumar [5]
Affiliations
[1] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing, Peoples R China
[2] Sun Yat Sen Univ, Sch Data & Comp Sci, Guangzhou, Guangdong, Peoples R China
[3] IBM Res, Yorktown Hts, NY USA
[4] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Hong Kong, Peoples R China
[5] Univ Melbourne, Cloud Comp & Distributed Syst CLOUDS Lab, Dept Comp & Informat Syst, Melbourne, Vic, Australia
Keywords
Cloud computing; cloud service; reliability; fault-tolerance; datacenter; network resource; component ranking; simulation
DOI
10.1109/TSC.2016.2519898
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
With rapid adoption of the cloud computing model, many enterprises have begun deploying cloud-based services. Failures of virtual machines (VMs) in clouds have caused serious quality assurance issues for those services. VM replication is a commonly used technique for enhancing the reliability of cloud services. However, when determining the VM redundancy strategy for a specific service, many state-of-the-art methods ignore the huge network resource consumption issue that could be experienced when the service is in failure recovery mode. This paper proposes a redundant VM placement optimization approach to enhancing the reliability of cloud services. The approach employs three algorithms. The first algorithm selects an appropriate set of VM-hosting servers from a potentially large set of candidate host servers based upon the network topology. The second algorithm determines an optimal strategy to place the primary and backup VMs on the selected host servers with k-fault-tolerance assurance. Lastly, a heuristic is used to address the task-to-VM reassignment optimization problem, which is formulated as finding a maximum weight matching in bipartite graphs. The evaluation results show that the proposed approach outperforms four other representative methods in network resource consumption in the service recovery stage.
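The abstract frames task-to-VM reassignment as finding a maximum weight matching in a bipartite graph. The sketch below illustrates only that formulation on a toy instance. It is not the paper's heuristic: it uses exhaustive search, which is practical only for a handful of tasks. The function name `max_weight_matching` and the weight matrix are hypothetical, with `weights[t][v]` standing in for an assumed benefit of reassigning task `t` to VM `v`.

```python
from itertools import permutations


def max_weight_matching(weights):
    """Exhaustively find a maximum-weight assignment of tasks to VMs.

    weights[t][v] is a hypothetical benefit of placing task t on VM v.
    Assumes there are at least as many VMs as tasks.
    Returns (best_total, assignment), where assignment[t] is the VM index
    chosen for task t and no VM is used twice.
    """
    n_tasks = len(weights)
    n_vms = len(weights[0]) if weights else 0
    best_total, best_assignment = float("-inf"), None
    # Each permutation picks a distinct VM for every task, i.e. a matching.
    for vms in permutations(range(n_vms), n_tasks):
        total = sum(weights[t][v] for t, v in enumerate(vms))
        if total > best_total:
            best_total, best_assignment = total, vms
    return best_total, best_assignment


# Toy instance: 3 tasks (rows) and 3 VMs (columns), illustrative values only.
weights = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
total, assignment = max_weight_matching(weights)
print(total, assignment)  # prints: 11 (0, 2, 1)
```

For realistic problem sizes a polynomial-time method would be needed instead of enumeration, for example the Hungarian algorithm, which SciPy exposes as `scipy.optimize.linear_sum_assignment` (with `maximize=True`).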
Pages: 902-913
Number of pages: 12