Reconciling fault-tolerant distributed computing and systems-on-chip

Cited by: 22
Authors
Fuegger, Matthias [1]
Schmid, Ulrich [1]
Affiliations
[1] Tech Univ Wien, Embedded Comp Syst Grp E182 2, A-1040 Vienna, Austria
Funding
Austrian Science Fund (FWF)
Keywords
Clock synchronization; Fault-tolerant distributed systems; Modeling approaches; VLSI; CLOCK SYNCHRONIZATION; SOFT ERRORS; DESIGN; IMPOSSIBILITY; ARCHITECTURE; CONSENSUS; CIRCUITS; ISSUES; TRENDS
DOI
10.1007/s00446-011-0151-7
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Classic distributed computing abstractions are a poor match for the reality of digital logic gates, which are the elementary building blocks of Systems-on-Chip (SoCs) and other Very Large Scale Integrated (VLSI) circuits: massively concurrent, continuous computations undermine the concept of sequential processes executing sequences of atomic zero-time computing steps, and very limited computational resources at gate level make even simple operations prohibitively costly. In this paper, we introduce a modeling and analysis framework based on continuous computations and zero-bit message channels, and employ this framework for the correctness and performance analysis of a distributed fault-tolerant clocking approach for SoCs. Starting from a "classic" distributed Byzantine fault-tolerant tick generation algorithm, we show how to adapt it for direct implementation in clockless digital logic, rigorously prove its correctness, and derive analytic expressions for worst-case performance metrics such as synchronization precision and clock frequency. Both the algorithm's correctness and the achievable synchronization precision depend solely on the ratios of certain path delays rather than on absolute delay values. Since these ratios can be mapped directly to placement and routing constraints, there is typically no need to change the algorithm when migrating to a faster implementation technology or when using a slightly different layout in an SoC.
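The abstract does not spell out the tick generation rules, so the following is only an illustrative sketch under stated assumptions, not the authors' algorithm or their hardware implementation. It assumes threshold rules in the style of consistent-broadcast tick generation (relay on f+1 matching ticks, advance on 2f+1), n >= 3f+1 nodes, and message delays drawn from a hypothetical interval [d_min, d_max]. Faulty nodes are modeled as silent, which is a simplification of Byzantine behavior. All names and parameters (`simulate`, `max_tick`, `d_min`, `d_max`) are made up for this example.

```python
import heapq
import itertools
import random


def simulate(n=4, f=1, max_tick=20, d_min=1.0, d_max=1.5, seed=0):
    """Return {node: [send time of tick 0, tick 1, ...]} for the correct nodes."""
    assert n >= 3 * f + 1, "threshold rules assume n >= 3f + 1"
    rng = random.Random(seed)
    correct = list(range(n - f))            # assumption: the last f nodes are faulty and silent
    k = {p: -1 for p in correct}            # highest tick sent so far by node p
    seen = {p: [-1] * n for p in correct}   # seen[p][q]: highest tick p has received from q
    sent_at = {p: [] for p in correct}      # sent_at[p][j]: time at which p sent tick j
    events = []                             # heap of (arrival time, seq, receiver, sender, tick)
    seq = itertools.count()                 # tie-breaker so heap entries always compare

    def broadcast(p, tick, now):
        # Send the tick to every correct node, including p itself.
        k[p] = tick
        sent_at[p].append(now)
        for r in correct:
            delay = rng.uniform(d_min, d_max)
            heapq.heappush(events, (now + delay, next(seq), r, p, tick))

    def supporters(p, tick):
        # Number of distinct nodes from which p has received `tick` or a later tick.
        return sum(1 for highest in seen[p] if highest >= tick)

    for p in correct:
        broadcast(p, 0, 0.0)

    while events:
        now, _, p, q, tick = heapq.heappop(events)
        seen[p][q] = max(seen[p][q], tick)
        progressed = True
        while progressed and k[p] < max_tick:
            progressed = False
            if supporters(p, k[p] + 1) >= f + 1:
                # Relay rule: at least one correct node is already ahead, so catch up.
                broadcast(p, k[p] + 1, now)
                progressed = True
            elif supporters(p, k[p]) >= 2 * f + 1:
                # Advance rule: enough nodes have reached the current tick.
                broadcast(p, k[p] + 1, now)
                progressed = True
    return sent_at


if __name__ == "__main__":
    ticks = simulate()
    spread = max(
        max(t[j] for t in ticks.values()) - min(t[j] for t in ticks.values())
        for j in range(21)
    )
    print("largest spread between correct nodes for the same tick:", spread)
```

In this simplified model, the time between the first and the last correct node sending a given tick stays within 2 * d_max, because the relay rule pulls slow nodes forward. Rerunning the sketch with different `d_min` and `d_max` values is one way to explore how delay bounds affect the spread; the analytic precision and frequency bounds are given in the paper itself and are not reproduced here.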
Pages: 323-355
Page count: 33