Computing global functions in asynchronous distributed systems with perfect failure detectors

Cited by: 34
Authors
Hélary, JM [1]
Hurfin, M [1]
Mostefaoui, A [1]
Raynal, M [1]
Tronel, F [1]
Affiliations
[1] Inst Rech Informat & Syst Aleatoires, F-35042 Rennes, France
Keywords
asynchronous distributed computation; global data; global function computation; perfect failure detector; problem reduction; process crash
DOI
10.1109/71.879773
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
A Global Data is a vector with one entry per process. Each entry must be filled with an appropriate value provided by the corresponding process. Several distributed computing problems amount to computing a function on a global data. This paper proposes a protocol to solve such problems in the context of asynchronous distributed systems where processes may fail by crashing. The main problem to be solved lies in computing the global data and in providing each noncrashed process with a copy of it, despite the possible crash of some processes. To be consistent, the global data must contain at least all the values provided by the processes that do not crash. This defines the Global Data Computation (GDC) problem. To solve this problem, processes execute a sequence of asynchronous rounds during which they construct, in a decentralized way, the value of the global data, and eventually each process gets a copy of it. To cope with process crashes, the protocol uses a perfect failure detector. The proposed protocol has been designed to be time efficient: it allows early decision. Let t be the maximum number of processes that may crash, with t < n where n is the total number of processes, and let f be the actual number of process crashes (f ≤ t). In the worst case, the protocol terminates in min(2f + 2, t + 1) rounds. Moreover, the protocol does not require processes to exchange information on their perception of crashes. The message size depends only on the size of the global data.
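The two quantitative statements in the abstract, the worst-case round bound min(2f + 2, t + 1) and the consistency condition on the global data, can be illustrated with a minimal Python sketch. It is not the protocol from the paper: the function names `worst_case_rounds` and `is_consistent` are hypothetical, and representing the entry of a crashed process by `None` is an assumption made only for this illustration.

```python
from typing import Optional, Sequence, Set


def worst_case_rounds(f: int, t: int, n: int) -> int:
    """Worst-case number of rounds, min(2f + 2, t + 1), as stated in the abstract.

    f: actual number of crashes, t: maximum number of crashes, n: number of processes.
    """
    if not (0 <= f <= t < n):
        raise ValueError("require 0 <= f <= t < n")
    return min(2 * f + 2, t + 1)


def is_consistent(
    global_data: Sequence[Optional[int]],
    proposed: Sequence[int],
    crashed: Set[int],
) -> bool:
    """Check the consistency condition on a candidate global data vector.

    Every process that did not crash must have its proposed value in its entry.
    Assumption (not from the abstract): the entry of a crashed process may hold
    either its proposed value or None.
    """
    if len(global_data) != len(proposed):
        return False
    for i, (entry, value) in enumerate(zip(global_data, proposed)):
        if entry == value:
            continue  # entry carries the value proposed by process i
        if i in crashed and entry is None:
            continue  # missing value is tolerated only for a crashed process
        return False
    return True
```

With t = 3 and n = 5, the bound gives 2 rounds when no process crashes and 4 rounds when f = 1 or f = 3, which shows how early decision helps only while 2f + 2 stays below t + 1.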
Pages: 897-909
Number of pages: 13