Deterministic Fault-Tolerant Distributed Computing in Linear Time and Communication

被引:1
|
作者
Chlebus, Bogdan S. [1 ]
Kowalski, Dariusz R. [1 ]
Olkowski, Jan [2 ]
机构
[1] Augusta Univ, Augusta, GA 30912 USA
[2] Univ Maryland, Baltimore, MD 21201 USA
来源
PROCEEDINGS OF THE 2023 ACM SYMPOSIUM ON PRINCIPLES OF DISTRIBUTED COMPUTING, PODC 2023 | 2023年
关键词
crash failure; Byzantine fault; authentication; consensus; check-pointing; gossiping; Ramanujan graph; lower bound; RANDOMIZED CONSENSUS; MESSAGE COMPLEXITY; BYZANTINE; AGREEMENT; ALGORITHMS; PROTOCOLS; BOUNDS;
D O I
10.1145/3583668.3594599
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
We develop deterministic algorithms for the problems of consensus, gossiping and checkpointing with nodes prone to failing. Distributed systems are modeled as synchronous complete networks. Failures are represented either as crashes or Byzantine faults with authentication. The algorithmic goal is to have both linear running time and linear amount of communication for as large an upper bound t on the number of faults as possible, with respect to the number of nodes n. For crash failures, these bounds of optimality are t = O (n/log n) for consensus and t = O (n/log(2) n) for gossiping and checkpointing. For the model of Byzantine faults with authentication, we show how to reach consensus in both linear running time and communication for t = O(root n). We show how to implement the consensus algorithm for crash failures in the single-port model such as to preserve the range of t for which both the running time and communication are optimal. We prove a lower bound Omega(l +log n) on the running time of algorithms for each of the considered problems.
引用
收藏
页码:344 / 354
页数:11
相关论文
共 50 条
  • [1] Reconciling fault-tolerant distributed algorithms and real-time computing
    Moser, Heinrich
    Schmid, Ulrich
    DISTRIBUTED COMPUTING, 2014, 27 (03) : 203 - 230
  • [2] An adaptive programming model for fault-tolerant distributed computing
    Gorender, Sergio
    Macedo, Raimundo Jose de Araujo
    Raynal, Michel
    IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, 2007, 4 (01) : 18 - 31
  • [3] A hybrid and adaptive model for fault-tolerant distributed computing
    Gorender, S
    Macêdo, R
    Raynal, M
    2005 INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS, PROCEEDINGS, 2005, : 412 - 421
  • [4] Improved Communication Complexity of Fault-Tolerant Consensus
    Hajiaghayi, Mohammad T.
    Kowalski, Dariusz R.
    Olkowski, Jan
    PROCEEDINGS OF THE 54TH ANNUAL ACM SIGACT SYMPOSIUM ON THEORY OF COMPUTING (STOC '22), 2022, : 488 - 501
  • [5] Reconciling fault-tolerant distributed computing and systems-on-chip
    Fuegger, Matthias
    Schmid, Ulrich
    DISTRIBUTED COMPUTING, 2012, 24 (06) : 323 - 355
  • [6] An Efficient Placement and Routing Technique for Fault-Tolerant Distributed Embedded Computing
    Jafari, Roozbeh
    Ghasemzadeh, Hassan
    Dabiri, Foad
    Nahapetian, Ani
    Sarrafzadeh, Majid
    ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 2009, 8 (04)
  • [7] Survey and future directions of fault-tolerant distributed computing on board spacecraft
    Fayyaz, Muhammad
    Vladimirova, Tanya
    ADVANCES IN SPACE RESEARCH, 2016, 58 (11) : 2352 - 2375
  • [8] Application controlled checkpointing coordination for fault-tolerant distributed computing systems
    Park, T
    Yeom, HY
    PARALLEL COMPUTING, 2000, 26 (04) : 467 - 482
  • [9] Distributed Fault-Tolerant Bipartite Output Synchronization of Discrete-Time Linear Multiagent Systems
    Zhang, Jie
    Ding, Da-Wei
    Lu, Yanrong
    Deng, Chao
    Ren, Yingying
    IEEE TRANSACTIONS ON CYBERNETICS, 2023, 53 (02) : 1360 - 1373
  • [10] Deterministic fault-tolerant connectivity labeling scheme
    Izumi, Taisuke
    Emek, Yuval
    Wadayama, Tadashi
    Masuzawa, Toshimitsu
    DISTRIBUTED COMPUTING, 2025, 38 (01) : 31 - 50