DISTRIBUTED RECOVERY IN FAULT-TOLERANT MULTIPROCESSOR NETWORKS.

Cited by: 0

Authors:

Yanney, Raif M. [1]

Hayes, John P. [1]

Affiliations:

[1] TRW, Redondo Beach, CA, USA

Source:

IEEE Transactions on Computers | 1986 / Vol. C-35 / No. 10

Keywords:

MATHEMATICAL TECHNIQUES - Graph Theory;

DOI:

Not available

Abstract:

A methodology is developed for characterizing dynamic distributed recovery in fault-tolerant multiprocessor systems using graph theory. Distributed recovery, which is intended for systems with no central supervisor, depends on the cooperation of a set of processors to execute the recovery function, since each processor is assumed to have only a limited amount of information about the system as a whole. Facility graphs, whose nodes denote the system components (processors) and whose edges denote the interconnections between components, are used to represent multiprocessor systems and error conditions. A general distributed recovery strategy, R, which allows global recovery to be achieved via a sequence of local actions, is given. R recovers the system in several steps in which different nodes successively act as the local supervisor. R is specialized for two important classes of systems: loop networks and tree networks. For each of these cases, fault-tolerant designs and their associated distributed recovery strategies, which allow recovery from up to k faults within a specified number of steps, are presented.
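
To make "global recovery via a sequence of local actions" concrete, here is a minimal Python sketch for a loop network. It is not the strategy R from the paper, which this record does not reproduce. The function names `make_loop` and `recover_loop`, the task-to-node `assignment` mapping, the single-fault restriction, and the clockwise shift toward a spare node are all assumptions made for illustration. Each step moves one task across one edge of the facility graph, so the node acting at each step needs only knowledge of its immediate neighbour.

```python
def make_loop(n):
    """Facility graph of a loop network: node i is linked to i-1 and i+1 (mod n)."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def recover_loop(graph, assignment, faulty):
    """Illustrative single-fault recovery on a loop (an assumption, not the paper's R).

    assignment maps each node to a task name, or to None for a spare node.
    The task on the faulty node is pushed one hop clockwise; the receiving node
    passes its own task onward in turn, until a spare absorbs the last task.
    Returns the new assignment and the number of local steps taken.
    """
    n = len(graph)
    new = dict(assignment)
    carried = new[faulty]      # task that must be rehosted
    new[faulty] = None         # the faulty node no longer hosts anything
    node, steps = faulty, 0
    while carried is not None:
        nxt = (node + 1) % n
        assert nxt in graph[node]          # each hand-off uses an existing edge
        if nxt == faulty:
            raise RuntimeError("no spare node available on the loop")
        # nxt acts as the local supervisor: it takes the carried task
        # and releases whatever it was hosting before.
        carried, new[nxt] = new[nxt], carried
        node = nxt
        steps += 1
    return new, steps


if __name__ == "__main__":
    loop = make_loop(4)
    tasks = {0: "A", 1: "B", 2: "C", 3: None}   # node 3 is the spare
    print(recover_loop(loop, tasks, faulty=1))
```

In this toy setting the number of steps equals the clockwise distance from the faulty node to the nearest spare, which gives a feel for what a bound on the number of recovery steps measures.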

Pages: 871-879