COOPERATIVE DIAGNOSIS AND ROUTING IN FAULT-TOLERANT MULTIPROCESSOR SYSTEMS

Cited by: 1
Authors
BLOUGH, DM
WANG, HY
Affiliation
[1] Department of Electrical and Computer Engineering, University of California, Irvine
DOI
10.1006/jpdc.1995.1083
Chinese Library Classification
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
In this note, we consider the problem of fault-tolerant routing in multiprocessor systems when only incomplete, or partial, diagnostic information is available. We first define a new type of partial diagnosis, known as k-reachability diagnosis. The overhead for k-reachability diagnosis increases with k, which specifies the radius of diagnostic information maintained by each node. We then present a routing algorithm, known as Algorithm Partial Route, that makes use of k-reachability diagnostic information and allows a trade-off between the amount of diagnostic information and the quality of routing. Partial Route is the first algorithm capable of handling systems of arbitrary topology containing an arbitrary number of faults. The worst-case performance of the algorithm in an n-node system is shown to be optimal when k = n - 1 and within a factor of 2 of optimal when k = 1. Simulation results on meshes and hypercubes are also presented; they show that, in the average case, Algorithm Partial Route is nearly optimal for relatively small values of k. (C) 1995 Academic Press, Inc.
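The abstract does not give the formal definition of k-reachability diagnosis, so the following is only a rough illustration of the idea that each node keeps diagnostic information within a radius k, and that this information grows with k. It assumes one plausible reading: a node's view is the set of fault-free nodes it can reach through fault-free nodes in at most k hops. The function names `mesh` and `k_radius_view` are hypothetical, and this is not Algorithm Partial Route itself.

```python
from collections import deque


def mesh(rows, cols):
    """Build the adjacency map of a rows x cols mesh; nodes are (row, col)."""
    adj = {}
    for r in range(rows):
        for c in range(cols):
            nbrs = []
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    nbrs.append((rr, cc))
            adj[(r, c)] = nbrs
    return adj


def k_radius_view(adj, faulty, source, k):
    """Assumed reading of a radius-k diagnostic view (not the paper's definition).

    Returns {node: hop distance} for every fault-free node that `source`
    can reach through fault-free nodes in at most k hops.
    """
    if source in faulty:
        return {}
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue  # do not look beyond the radius
        for v in adj[u]:
            if v not in faulty and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


# 3 x 3 mesh with a faulty centre node.
adj = mesh(3, 3)
faulty = {(1, 1)}
view_1 = k_radius_view(adj, faulty, (0, 0), 1)
view_full = k_radius_view(adj, faulty, (0, 0), len(adj) - 1)
print(len(view_1), len(view_full), view_full[(2, 2)])
```

With k = 1 the corner node knows only itself and its two fault-free neighbours; with k = n - 1 it knows every fault-free node it can reach, together with the length of a shortest fault-free path to each, which matches the intuition behind the abstract's statement that routing is optimal at that setting.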
Pages: 205 - 211
Page count: 7
Related papers
50 in total
  • [21] Fault-tolerant hierarchical routing
    Alari, G
    Datta, A
    Derby, J
    Lawrence, J
    1997 IEEE INTERNATIONAL PERFORMANCE, COMPUTING AND COMMUNICATIONS CONFERENCE, 1997, : 159 - 165
  • [22] A survey on cooperative fault-tolerant control for multiagent systems
    Zhang, Pu
    Zhao, Di
    Kong, Xiangjie
    Zhang, Jialong
    Li, Lei
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2024, 18 (06): : 1431 - 1448
  • [23] Fault-tolerant convergence routing
    Yener, B
    Bhandari, I
    Ofek, Y
    Yung, M
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 1997, 42 (02) : 173 - 183
  • [24] Fault-Tolerant Routing in Bicubes
    Wang, Yitong
    Kyaw, Htoo Htoo Sandi
    Fujiyoshi, Kunihiro
    Kaneko, Keiichi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2025, E108D (01) : 74 - 81
  • [25] Cooperative Fault-Tolerant Control of Networked Control Systems
    Schenk, Kai
    Guelbitti, Baris
    Lunze, Jan
    IFAC PAPERSONLINE, 2018, 51 (24): : 570 - 577
  • [26] Toward Efficient Design Space Exploration for Fault-Tolerant Multiprocessor Systems
    Yuan, Bo
    Chen, Huanhuan
    Yao, Xin
    IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, 2020, 24 (01) : 157 - 169
  • [27] Fault-tolerant expansion of system area networks in multiprocessor computer systems
    A. B. Nikolaev
    V. S. Podlazov
    Automation and Remote Control, 2008, 69 : 150 - 157
  • [28] Fault-tolerant expansion of system area networks in multiprocessor computer systems
    Nikolaev, A. B.
    Podlazov, V. S.
    AUTOMATION AND REMOTE CONTROL, 2008, 69 (01) : 150 - 157
  • [29] Fault-tolerant expansion of system area networks in multiprocessor computer systems
    Trapeznikov Institute of Control Sciences, Russian Academy of Sciences, Moscow, Russia
    Autom. Remote Control, 2008, 1 (150-157):
  • [30] Efficient fault-tolerant scheduling on multiprocessor systems via replication and deallocation
    Zhang, Jun
    Sha, Edwin H-M.
    Zhuge, Qingfeng
    Yi, Juan
    Wu, Kaijie
    INTERNATIONAL JOURNAL OF EMBEDDED SYSTEMS, 2014, 6 (2-3) : 216 - 224