On diagnosability of large fault sets in regular topology-based computer systems

被引:102
作者
Somani, AK
Peleg, O
机构
[1] UNIV WASHINGTON,DEPT COMP SCI & ENGN,SEATTLE,WA 98195
[2] RAFAEL,HAIFA,ISRAEL
关键词
fault diagnosis; system level diagnosis; t-diagnosable systems; tlk-diagnosable systems; degree of diagnosability; tlk-diagnosability of hypercubes; tlk-diagnosability of star-graph;
D O I
10.1109/12.536232
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
The classical t-diagnosability approach has its limitation when dealing with large fault sets in large multiprocessor systems. This is due to limited diagnosability of large multiprocessor systems connected using regular interconnection structures. We propose an alternative approach to system diagnosis by allowing a few upper bounded number of units to be diagnosed incorrectly. This measure is called t/k-diagnosability. Using this new measure, it is possible to increase the degree of diagnosability of large system considerably. The t/k-diagnosis guarantees that all the faulty units (processors) in a system are detected (provided the number of faulty units does not exceed t) while at most k units are incorrectly diagnosed. We provide necessary and sufficient conditions for t/k-diagnosability and discuss their implication. To demonstrate the power of this approach, we analyze the diagnosability of large systems connected as hypercube, star-graph, and meshes. It is shown that a substantial increase in the degree of diagnosability of these structures is achieved, compared with the degree of diagnosability achieved using the classic t-diagnosability approach, at the cost of a comparably small number of incorrectly diagnosed units.
引用
收藏
页码:892 / 903
页数:12
相关论文
共 21 条
[1]   APPROACH TO DIAGNOSABILITY ANALYSIS OF A SYSTEM [J].
ALLAN, FJ ;
KAMEDA, T ;
TOIDA, S .
IEEE TRANSACTIONS ON COMPUTERS, 1975, 24 (10) :1040-1042
[2]  
ARMSTRONG JR, 1981, IEEE T COMPUT, V30, P587, DOI 10.1109/TC.1981.1675844
[3]  
BERMAN P, 1990, P 20 INT S FAULT TOL
[4]  
Blough D. M., 1992, Proceedings. Sixth International Parallel Processing Symposium (Cat. No.92TH0419-2), P398, DOI 10.1109/IPPS.1992.223013
[5]   REARRANGEABLE CIRCUIT-SWITCHED HYPERCUBE ARCHITECTURES FOR ROUTING PERMUTATIONS [J].
CHOI, SB ;
SOMANI, AK .
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 1993, 19 (02) :125-130
[6]   THE GENERALIZED FOLDING-CUBE NETWORK [J].
CHOI, SB ;
SOMANI, AK .
NETWORKS, 1991, 21 (03) :267-294
[7]   ON FAULT IDENTIFICATION IN DIAGNOSABLE SYSTEMS [J].
CHWA, KY ;
HAKIMI, SL .
IEEE TRANSACTIONS ON COMPUTERS, 1981, 30 (06) :414-422
[8]  
Friedman A. D., 1975, 1975 International Symposium on Fault-Tolerant Computing. Digest of papers, P167
[9]   PARTITIONING OF EVEN NETWORKS FOR IMPROVED DIAGNOSABILITY [J].
GHAFOOR, A .
IEEE TRANSACTIONS ON RELIABILITY, 1990, 39 (03) :281-286
[10]   CHARACTERIZATION OF CONNECTION ASSIGNMENT OF DIAGNOSABLE SYSTEMS [J].
HAKIMI, SL ;
AMIN, AT .
IEEE TRANSACTIONS ON COMPUTERS, 1974, C 23 (01) :86-88