COOPERATIVE DIAGNOSIS AND ROUTING IN FAULT-TOLERANT MULTIPROCESSOR SYSTEMS

Cited by: 1
Authors
BLOUGH, DM
WANG, HY
Affiliation
[1] Department of Electrical and Computer Engineering, University of California, Irvine
DOI
10.1006/jpdc.1995.1083
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
In this note, we consider the problem of fault-tolerant routing in multiprocessor systems when only incomplete, or partial, diagnostic information is available. We first define a new type of partial diagnosis, known as k-reachability diagnosis. The overhead for k-reachability diagnosis increases with k, which specifies the radius of diagnostic information maintained by each node. We then present a routing algorithm, known as Algorithm Partial Route, that makes use of k-reachability diagnostic information and allows a trade-off between the amount of diagnostic information and the quality of routing. Partial Route is the first algorithm capable of handling systems of arbitrary topology containing an arbitrary number of faults. The worst-case performance of the algorithm in an n-node system is shown to be optimal when k = n - 1 and within a factor of 2 of optimal when k = 1. Simulation results on meshes and hypercubes are also presented; they show that, in the average case, Algorithm Partial Route is nearly optimal for relatively small values of k. (C) 1995 Academic Press, Inc.
Pages: 205-211
Page count: 7