A Survey of Comparison-Based System-Level Diagnosis

Cited by: 52
Authors
Duarte, Elias P., Jr. [1 ]
Ziwich, Roverli P. [1 ]
Albini, Luiz C. P. [1 ]
Affiliations
[1] Univ Fed Parana, Dept Informat, BR-80060000 Curitiba, Parana, Brazil
Keywords
Algorithms; Reliability; Security; Comparison-based diagnosis; multiprocessor systems; dependability; AD HOC NETWORKS; FAULT-DIAGNOSIS; MULTIPROCESSOR SYSTEMS; SEQUENTIAL DIAGNOSABILITY; EVOLUTIONARY ALGORITHM; CONNECTION ASSIGNMENT; PRODUCT NETWORKS; IDENTIFICATION; CUBE; TOLERANCE
DOI
10.1145/1922649.1922659
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The growing complexity and dependability requirements of hardware, software, and networks demand efficient techniques for discovering disruptive behavior in those systems. Comparison-based diagnosis is a realistic approach to detect faulty units based on the outputs of tasks executed by system units. This survey integrates the vast amount of research efforts that have been produced in this field, from the earliest theoretical models to new promising applications. Key results also include the quantitative evaluation of a relevant reliability metric, the diagnosability, of several popular interconnection network topologies. Relevant diagnosis algorithms are also described. The survey aims at clarifying and uncovering the potential of this technology, which can be applied to improve the dependability of diverse complex computer systems.
Pages: 56