Reconciling fault-tolerant distributed computing and systems-on-chip

Cited by: 22
Authors
Fuegger, Matthias [1]
Schmid, Ulrich [1]
Affiliations
[1] Tech Univ Wien, Embedded Comp Syst Grp E182 2, A-1040 Vienna, Austria
Funding
Austrian Science Fund;
Keywords
Clock synchronization; Fault-tolerant distributed systems; Modeling approaches; VLSI; CLOCK SYNCHRONIZATION; SOFT ERRORS; DESIGN; IMPOSSIBILITY; ARCHITECTURE; CONSENSUS; CIRCUITS; ISSUES; TRENDS;
DOI
10.1007/s00446-011-0151-7
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Classic distributed computing abstractions do not match well the reality of digital logic gates, which are the elementary building blocks of Systems-on-Chip (SoCs) and other Very Large Scale Integrated (VLSI) circuits: Massively concurrent, continuous computations undermine the concept of sequential processes executing sequences of atomic zero-time computing steps, and very limited computational resources at gate-level make even simple operations prohibitively costly. In this paper, we introduce a modeling and analysis framework based on continuous computations and zero-bit message channels, and employ this framework for the correctness & performance analysis of a distributed fault-tolerant clocking approach for Systems-on-Chip (SoCs). Starting out from a "classic" distributed Byzantine fault-tolerant tick generation algorithm, we show how to adapt it for direct implementation in clockless digital logic, and rigorously prove its correctness and derive analytic expressions for worst case performance metrics like synchronization precision and clock frequency. Rather than on absolute delay values, both the algorithm's correctness and the achievable synchronization precision depend solely on the ratio of certain path delays. Since these ratios can be mapped directly to placement & routing constraints, there is typically no need for changing the algorithm when migrating to a faster implementation technology and/or when using a slightly different layout in an SoC.
Pages: 323-355
Number of pages: 33
Related papers
50 in total
  • [1] Reconciling fault-tolerant distributed computing and systems-on-chip
    Matthias Függer
    Ulrich Schmid
    Distributed Computing, 2012, 24 : 323 - 355
  • [2] Rigorously modeling self-stabilizing fault-tolerant circuits: An ultra-robust clocking scheme for systems-on-chip
    Dolev, Danny
    Fuegger, Matthias
    Posch, Markus
    Schmid, Ulrich
    Steininger, Andreas
    Lenzen, Christoph
    JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 2014, 80 (04) : 860 - 900
  • [3] Reconciling fault-tolerant distributed algorithms and real-time computing
    Moser, Heinrich
    Schmid, Ulrich
    DISTRIBUTED COMPUTING, 2014, 27 (03) : 203 - 230
  • [4] Evaluation of fault-tolerant distributed web systems
    Hong, YS
    No, JH
    Han, I
    WORDS 2005: 10th IEEE International Workshop on Object-Oriented Real-Time Dependable, Proceedings, 2005, : 148 - 151
  • [5] Adaptive distributed and fault-tolerant systems
    Hiltunen, MA
    Schlichting, RD
    COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 1996, 11 (05): : 275 - 285
  • [6] An adaptive programming model for fault-tolerant distributed computing
    Gorender, Sergio
    Macedo, Raimundo Jose de Araujo
    Raynal, Michel
    IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, 2007, 4 (01) : 18 - 31
  • [7] A hybrid and adaptive model for fault-tolerant distributed computing
    Gorender, S
    Macêdo, R
    Raynal, M
    2005 INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS, PROCEEDINGS, 2005, : 412 - 421
  • [8] On fault-tolerant data replication in distributed systems
    Tenzekhti, F
    Day, K
    Ould-Khaoua, M
    MICROPROCESSORS AND MICROSYSTEMS, 2002, 26 (07) : 301 - 309
  • [9] Deterministic Fault-Tolerant Distributed Computing in Linear Time and Communication
    Chlebus, Bogdan S.
    Kowalski, Dariusz R.
    Olkowski, Jan
    PROCEEDINGS OF THE 2023 ACM SYMPOSIUM ON PRINCIPLES OF DISTRIBUTED COMPUTING, PODC 2023, 2023, : 344 - 354
  • [10] Distributed Fault-Tolerant Average-Tracking for Linear Multi-Agent Systems
    Li, Yu-Ling
    Liu, Cheng-Lin
    2023 5TH INTERNATIONAL CONFERENCE ON CONTROL AND ROBOTICS, ICCR, 2023, : 245 - 249