Failure detectors in homonymous distributed systems (with an application to consensus)

被引:10
作者
Arevalo, Sergio [1 ]
Fernandez Anta, Antonio [2 ]
Imbs, Damien [3 ]
Jimenez, Ernesto [1 ,4 ]
Raynal, Michel [5 ,6 ]
机构
[1] Univ Politecn Madrid, Madrid 28031, Spain
[2] Inst IMDEA Networks, Madrid 28918, Spain
[3] Univ Bremen, Dept Math, D-28334 Bremen, Germany
[4] EPN, Quito, Ecuador
[5] Univ Rennes 1, Inst Univ France, F-35042 Rennes, France
[6] Univ Rennes 1, IRISA, F-35042 Rennes, France
关键词
Agreement problem; Asynchrony; Consensus; Distributed computability; Failure detector; Homonymous systems; Message-passing; Process crash;
D O I
10.1016/j.jpdc.2015.05.007
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
This paper is on homonymous distributed systems where processes are prone to crash failures and have no initial knowledge of the system membership ("homonymous" means that several processes may have the same identifier). New classes of failure detectors suited to these systems are first defined. Among them, the classes H Omega and H Sigma are introduced that are the homonymous counterparts of the classes Omega and Sigma, respectively. (Recall that the pair <Omega, Sigma > defines the weakest failure detector to solve consensus.) Then, the paper shows how H Omega and H Sigma can be implemented in homonymous systems without membership knowledge (under different synchrony requirements). Finally, two algorithms are presented that use these failure detectors to solve consensus in homonymous asynchronous systems where there is no initial knowledge of the membership. One algorithm solves consensus with < H Omega, H Sigma >, while the other uses only HQ, but needs a majority of correct processes. Observe that the systems with unique identifiers and anonymous systems are extreme cases of homonymous systems from which follows that all these results also apply to these systems. Interestingly, the new failure detector class H Omega can be implemented with partial synchrony (i.e., all messages sent after some bounded time GST will be received after at most an unknown bounded latency delta), while the analogous class A Omega defined for anonymous systems cannot be implemented (even in synchronous systems). Hence, the paper provides the first consensus algorithm for anonymous systems with this model of partial synchrony and a majority of correct processes. (C) 2015 Elsevier Inc. All rights reserved.
引用
收藏
页码:83 / 95
页数:13
相关论文
共 22 条
[1]  
Angluin Dana, 1980, P 12 ANN ACM S THEOR, P82, DOI DOI 10.1145/800141.804655
[2]   COMPUTING ON AN ANONYMOUS RING [J].
ATTIYA, H ;
SNIR, M ;
WARMUTH, MK .
JOURNAL OF THE ACM, 1988, 35 (04) :845-875
[3]  
Bonnet F., 2010, 1945 PI IRISA
[4]   Anonymous asynchronous systems: the case of failure detectors [J].
Bonnet, Francois ;
Raynal, Michel .
DISTRIBUTED COMPUTING, 2013, 26 (03) :141-158
[5]   The Price of Anonymity: Optimal Consensus Despite Asynchrony, Crash, and Anonymity [J].
Bonnet, Francois ;
Raynal, Michel .
ACM TRANSACTIONS ON AUTONOMOUS AND ADAPTIVE SYSTEMS, 2011, 6 (04)
[6]   Consensus in Anonymous Distributed Systems: Is There a Weakest Failure Detector? [J].
Bonnet, Francois ;
Raynal, Michel .
2010 24TH IEEE INTERNATIONAL CONFERENCE ON ADVANCED INFORMATION NETWORKING AND APPLICATIONS (AINA), 2010, :206-213
[7]  
Bouzid Z, 2012, LECT NOTES COMPUT SC, V7611, P427, DOI 10.1007/978-3-642-33651-5_41
[8]  
Bouzid Z, 2011, LECT NOTES COMPUT SC, V7109, P175, DOI 10.1007/978-3-642-25873-2_13
[9]   Unreliable failure detectors for reliable distributed systems [J].
Chandra, TD ;
Toueg, S .
JOURNAL OF THE ACM, 1996, 43 (02) :225-267
[10]   The weakest failure detector for solving Consensus [J].
Chandra, TD ;
Hadzilacos, V ;
Toueg, S .
JOURNAL OF THE ACM, 1996, 43 (04) :685-722