LEARNING PROCESS BEHAVIOR FOR FAULT DETECTION

Cited by: 4
Authors
Pereira, Cassio M. M. [1]
De Mello, Rodrigo F. [1]
Affiliations
[1] Univ Sao Paulo, Inst Math Sci & Comp, BR-13560970 Sao Paulo, Brazil
Funding
São Paulo Research Foundation, Brazil
Keywords
Fault tolerance; fault detection; process behavior
DOI
10.1142/S0218213011000450
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Recently, there has been increased interest in self-healing systems. Such systems are able to cope with failures in the environment in which they execute and to continue working by taking proactive actions to correct these problems. The detection of faults plays a prominent role in self-healing systems, as faults are the original causes of failures. Fault detection techniques proposed in the literature have been based on three mainstream approaches: process heartbeats, statistical analysis and machine learning. However, these approaches have limitations. Heartbeat-based techniques only detect failures, not faults. Statistical approaches generally assume linear models. Most machine learning techniques assume the data are independent and identically distributed. To overcome these limitations, we propose a new approach to fault detection, which also gives insight into how process behavior changes over time in the presence of faults. Experiments show that the proposed approach achieves a twofold increase in F-measure when compared to Support Vector Machines (SVM) and Auto-Regressive Integrated Moving Average (ARIMA).
Pages: 969-980
Page count: 12