SYNTHESIS OF ALGORITHM-BASED FAULT-TOLERANT SYSTEMS FOR DEPENDENCE GRAPHS

被引:11
|
作者
VINNAKOTA, B [1 ]
JHA, NK [1 ]
机构
[1] PRINCETON UNIV,DEPT ELECT ENGN,PRINCETON,NJ 08544
关键词
ALGORITHM-BASED FAULT TOLERANCE; CHECKSUM ENCODING; CONCURRENT ERROR DETECTION; DEPENDENCE GRAPHS; FAULT DETECTABILITY; FAULT LOCATABILITY; SYSTEM SYNTHESIS FOR FAULT TOLERANCE;
D O I
10.1109/71.238622
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
Algorithm-Based Fault Tolerance (ABFT) is a scheme to improve the reliability of parallel architectures used for computation-intensive tasks. The exact implementation of an ABFT scheme is algorithm-dependent. ABFT systems have very low overhead compared to other fault tolerance schemes with similar benefits. Few results are available in the area of general synthesis of ABFT systems. A two-stage approach to the synthesis of ABFT systems is proposed. In the first stage a system-level code is chosen to encode the data used in the algorithm. In the second stage the optimal architecture to implement the scheme is chosen using dependence graphs. Dependence graphs are a graph-theoretic form of algorithm representation. We demonstrate that not all architectures are ideal for the implementation of a particular ABFT scheme. We propose new measures to characterize the fault tolerance capability of a system to better exploit the proposed synthesis method. Dependence graphs can also be used for the synthesis of ABFT schemes for non-linear problems. An example of a fault-tolerant median filter is provided to illustrate their utility for such problems.
引用
收藏
页码:864 / 874
页数:11
相关论文
共 50 条
  • [21] A Byzantine fault-tolerant mutual exclusion algorithm and its application to Byzantine fault-tolerant storage systems
    Kim, JM
    Manabe, Y
    25TH IEEE INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS WORKSHOPS, PROCEEDINGS, 2005, : 12 - 19
  • [22] FAULT-TOLERANT SYSTEMS
    AVIZIENIS, A
    IEEE TRANSACTIONS ON COMPUTERS, 1976, 25 (12) : 1304 - 1312
  • [23] FAULT-TOLERANT SYSTEMS
    SINGH, AD
    MURUGESAN, S
    COMPUTER, 1990, 23 (07) : 15 - 17
  • [24] Mapping a Fault-Tolerant Distributed Algorithm to Systems on Chip
    Fuchs, Gottfried
    Fuegger, Matthias
    Schmid, Ulrich
    Steininger, Andreas
    11TH EUROMICRO CONFERENCE ON DIGITAL SYSTEM DESIGN - ARCHITECTURES, METHODS AND TOOLS : DSD 2008, PROCEEDINGS, 2008, : 242 - 249
  • [25] Fault-tolerant graphs for hypercubes and tori
    Yamada, Toshinori
    Yamamoto, Koji
    Ueno, Shuichi
    IEICE Transactions on Information and Systems, 1996, E79-D (08) : 1147 - 1152
  • [26] Dependability Analysis for Fault-Tolerant Computer Systems Using Dynamic Fault Graphs
    Zhao Feng
    Jin Hai
    Zou Deqing
    Qin Pan
    CHINA COMMUNICATIONS, 2014, 11 (09) : 16 - 30
  • [27] On fault-tolerant partition dimension of graphs
    Azhar, Kamran
    Zafar, Sohail
    Kashif, Agha
    Zahid, Zohaib
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2021, 40 (01) : 1129 - 1135
  • [28] Fault-tolerant graphs for hypercubes and tori
    Yamada, T
    Yamamoto, K
    Ueno, S
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 1996, E79D (08): : 1147 - 1152
  • [29] Fault-Tolerant Consensus in Directed Graphs
    Tseng, Lewis
    Vaidya, Nitin H.
    PODC'15: PROCEEDINGS OF THE 2015 ACM SYMPOSIUM ON PRINCIPLES OF DISTRIBUTED COMPUTING, 2015, : 451 - 460
  • [30] Fault-Tolerant Spanners for General Graphs
    Chechik, S.
    Langberg, M.
    Peleg, D.
    Roditty, L.
    STOC'09: PROCEEDINGS OF THE 2009 ACM SYMPOSIUM ON THEORY OF COMPUTING, 2009, : 435 - 444