An accrual failure detector in cloud computing

Cited by: 0
Authors
Wu J. [1]
Liu J. [2]
Dong J. [1]
Zuo D. [1]
Zhao Y. [1]
Affiliations
[1] Fault-tolerant and Mobile Computing Research Center, Harbin Institute of Technology, Harbin
[2] Jiangsu Key Laboratory for Broadband Wireless Communication and Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing
Source
Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology | 2019 / Vol. 51 / No. 11
Keywords
Accrual failure detector; Cloud computing; Quality of service; Weibull distribution
DOI
10.11918/j.issn.0367-6234.201903149
Abstract
To address the problem that failure-detection performance in cloud computing is affected by the dynamics of the network environment, a new adaptive accrual failure detector, the Two Windows Accrual Failure Detector (2WA-FD), was proposed. First, two groups of real data from two network conditions were analyzed, showing that the Weibull distribution is a more reasonable assumption for heartbeat inter-arrival time; under this assumption, the suspicion level of the accrual failure detector is more accurate. Second, the framework of the accrual failure detector was analyzed and improved so that the suspicion level is calculated from two sliding windows, which makes the framework suited to dynamic network conditions. Finally, 2WA-FD and other failure detectors were tested on open-source experimental data and on the authors' experimental platform. The experimental results show that, with the same detection overhead, 2WA-FD achieves lower detection time and higher detection accuracy. Thus, 2WA-FD can find node failures in cloud computing accurately and quickly, and effectively reduces the influence of network dynamics on failure-detection performance. © 2019, Editorial Board of Journal of Harbin Institute of Technology. All rights reserved.
Pages: 16-21
Number of pages: 5
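The abstract describes a suspicion level computed from two sliding windows under a Weibull assumption for heartbeat inter-arrival times, but this record does not give the formulas. The following is only a minimal sketch of that idea, not the paper's 2WA-FD: the class name, the window sizes, the fixed shape parameter, the moment-based scale estimate, and the rule of taking the larger of the two scale estimates are all assumptions made for illustration. The suspicion level follows the φ convention of accrual detectors, φ = -log10(1 - F(t)), with F the Weibull CDF.

```python
import math
from collections import deque


class TwoWindowWeibullDetector:
    """Illustrative accrual detector (hypothetical, not the paper's 2WA-FD)."""

    def __init__(self, short_size=10, long_size=100, shape=2.0):
        # Two sliding windows of heartbeat inter-arrival times (assumed sizes).
        self.windows = (deque(maxlen=short_size), deque(maxlen=long_size))
        self.shape = shape  # Weibull shape k, fixed here as an assumption
        self.last_arrival = None

    def heartbeat(self, now):
        """Record a heartbeat that arrived at time `now` (seconds)."""
        if self.last_arrival is not None:
            gap = now - self.last_arrival
            for window in self.windows:
                window.append(gap)
        self.last_arrival = now

    def _scale(self, window):
        # Weibull mean is scale * Gamma(1 + 1/k), so estimate scale from the
        # sample mean of the window.
        mean = sum(window) / len(window)
        return mean / math.gamma(1.0 + 1.0 / self.shape)

    def suspicion(self, now):
        """Return phi = -log10(P(next heartbeat arrives later than `now`))."""
        if self.last_arrival is None or not self.windows[0]:
            return 0.0  # not enough samples yet
        elapsed = now - self.last_arrival
        # Assumed combination rule: use the larger scale estimate, which
        # yields the lower (more conservative) suspicion level.
        scale = max(self._scale(w) for w in self.windows)
        # With 1 - F(t) = exp(-(t/scale)^k), phi = (t/scale)^k / ln(10).
        return (elapsed / scale) ** self.shape / math.log(10)


# Usage: feed heartbeat arrival times, then query the suspicion level.
detector = TwoWindowWeibullDetector()
for t in (0.0, 1.0, 2.0, 3.0):
    detector.heartbeat(t)
level = detector.suspicion(4.5)
```

An application would compare the returned level against its own threshold; the short window lets the estimate follow recent network conditions while the long window smooths out transient bursts.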