On the Quality of Service of Crash-Recovery Failure Detectors

Cited by: 9
Authors
Ma, Tiejun [1]
Hillston, Jane [2]
Anderson, Stuart [2]
Affiliations
[1] Univ London Imperial Coll Sci Technol & Med, Dept Comp, S Kensington Campus,180 Queens Gate London, London SW7 2AZ, England
[2] Univ Edinburgh, Sch Informat, Lab Fdn Comp Sci, Edinburgh EH8 9AB, Midlothian, Scotland
Keywords
Failure detectors; crash recovery; quality of service; availability; dependability; performance; CONSENSUS;
DOI
10.1109/TDSC.2009.35
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
We model the probabilistic behavior of a system comprising a failure detector and a monitored crash-recovery target. We extend failure detectors to take account of failure recovery in the target system. This involves extending QoS measures to include the recovery detection speed and proportion of failures detected. We also extend the method for estimating the failure detector parameters that achieve a required QoS so that it applies to configuring the crash-recovery failure detector. We investigate the impact of the dependability of the monitored process on the QoS of our failure detector. Our analysis indicates that variation in the MTTF and MTTR of the monitored process can have a significant impact on the QoS of our failure detector. Our analysis is supported by simulations that validate our theoretical results.
Pages: 271-283
Page count: 13