A Fault-Tolerant Token-Based Atomic Broadcast Algorithm

被引:2
作者
Ekwall, Richard [1 ,2 ]
Schiper, Andre [3 ]
机构
[1] Ecole Polytech Fed Lausanne, Distributed Syst Lab, Lausanne, Switzerland
[2] Google Switzerland GmbH, CH-8002 Zurich, Switzerland
[3] Ecole Polytech Fed Lausanne, Ecole Polytech, Stn 14, LSR IIF I&C, CH-1015 Lausanne, Switzerland
关键词
Distributed systems; fault tolerance; performance measurements; GROUP MEMBERSHIP;
D O I
10.1109/TDSC.2010.24
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
Many atomic broadcast algorithms have been published in the last 20 years. Token-based algorithms represent a large class of these algorithms. Interestingly, all the token-based atomic broadcast algorithms rely on a group membership service and none of them uses unreliable failure detectors directly. This paper presents the first token-based atomic broadcast algorithm that uses an unreliable failure detector instead of a group membership service. It requires a system size that is quadratic in the number of supported failures. The special case of a single supported failure (f = 1) requires n = 3 processes. We experimentally evaluate the performance of this algorithm in local and wide area networks, in order to emphasize that atomic broadcast is efficiently implemented by combining a failure detector and a token-based mechanism. The evaluation shows that the new token-based algorithm surpasses the performance of the other algorithms in most small-system settings.
引用
收藏
页码:625 / 639
页数:15
相关论文
共 38 条
[1]  
ALVISI L, 1998, P 2 INF SURV WORKSH, P5
[2]   THE TOTEM SINGLE-RING ORDERING AND MEMBERSHIP PROTOCOL [J].
AMIR, Y ;
MOSER, LE ;
MELLIARSMITH, PM ;
AGARWAL, DA ;
CIARFELLA, P .
ACM TRANSACTIONS ON COMPUTER SYSTEMS, 1995, 13 (04) :311-342
[3]  
ANKER T, 2003, P INT WORKSH LARG SC
[4]  
[Anonymous], 1985, 8024 ANSIIEEE
[5]  
BAKR O, 2002, P 21 ANN S PRINC DIS, P243
[6]  
BIRMAN K, 1991, ACM T COMPUT SYST, V9, P272, DOI 10.1145/128738.128742
[7]  
CAPPELLO F, 2005, P IEEE ACM WORKSH GR
[8]   Unreliable failure detectors for reliable distributed systems [J].
Chandra, TD ;
Toueg, S .
JOURNAL OF THE ACM, 1996, 43 (02) :225-267
[9]  
Chandra T, 2007, PODC'07: PROCEEDINGS OF THE 26TH ANNUAL ACM SYMPOSIUM ON PRINCIPLES OF DISTRIBUTED COMPUTING, P398
[10]   RELIABLE BROADCAST PROTOCOLS [J].
CHANG, JM ;
MAXEMCHUK, NF .
ACM TRANSACTIONS ON COMPUTER SYSTEMS, 1984, 2 (03) :251-273