Fault Detection for Cloud Computing Systems with Correlation Analysis

Cited by: 0
Authors
Wang, Tao [1]
Zhang, Wenbo [1]
Wei, Jun [1]
Zhong, Hua [1]
Affiliations
[1] Chinese Acad Sci, Inst Software, Beijing 100190, Peoples R China
Source
PROCEEDINGS OF THE 2015 IFIP/IEEE INTERNATIONAL SYMPOSIUM ON INTEGRATED NETWORK MANAGEMENT (IM) | 2015
Keywords
Software Monitoring; Performance Anomaly; Fault Detection; Cloud Computing
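DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract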
The large-scale, dynamic cloud computing environment raises great challenges for fault diagnosis in Web applications. First, fluctuating workloads cause traditional application models to change over time. Moreover, modeling the behavior of complex applications requires domain knowledge, which is difficult to obtain. Finally, managing large-scale applications manually is impractical for operators. This paper addresses these issues and proposes an automatic fault diagnosis method for Web applications in cloud computing. We propose an online incremental clustering method to recognize access behavior patterns, and use canonical correlation analysis (CCA) to model the correlation between workloads and the metrics of application performance and resource utilization within a specific access behavior pattern. Our method detects anomalies by discovering abrupt changes in the correlation coefficients with an EWMA control chart, and then locates suspicious metrics using a feature selection method that combines ReliefF and SVM-RFE. We validate our method by injecting typical faults into TPC-W, an industry-standard benchmark, and the experimental results demonstrate that it can effectively detect typical faults.
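The detection step named in the abstract, an EWMA control chart applied to a series of correlation coefficients, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `ewma_alarms`, the baseline length, the smoothing weight `lam = 0.2`, and the limit width of 3 standard deviations are all assumptions chosen for illustration, and the input is assumed to be a plain list of coefficients already produced by some correlation model such as CCA.

```python
import math
from statistics import mean, stdev


def ewma_alarms(series, baseline_len=20, lam=0.2, width=3.0):
    """Return indices of `series` where the EWMA statistic leaves its control limits.

    All parameter values are illustrative assumptions, not taken from the paper.
    The first `baseline_len` points are treated as fault-free and are used to
    estimate the in-control mean and standard deviation.
    """
    baseline = series[:baseline_len]
    mu0 = mean(baseline)      # in-control mean of the coefficient
    sigma = stdev(baseline)   # in-control sample standard deviation

    z = mu0                   # EWMA statistic starts at the in-control mean
    alarms = []
    for t, x in enumerate(series[baseline_len:], start=1):
        # Exponentially weighted moving average of the monitored coefficient.
        z = lam * x + (1.0 - lam) * z
        # Time-varying control limit of the standard EWMA chart.
        half_width = width * sigma * math.sqrt(
            lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t))
        )
        if abs(z - mu0) > half_width:
            alarms.append(baseline_len + t - 1)
    return alarms


if __name__ == "__main__":
    # Hypothetical coefficients: stable near 0.91, then an abrupt drop to 0.5.
    coefficients = [0.90, 0.92] * 15 + [0.5] * 10
    print(ewma_alarms(coefficients))
```

In this made-up series the coefficient stays near 0.91 for 30 samples and then drops, so the chart raises alarms only from the point where the drop begins. A smaller `lam` smooths more and reacts more slowly; a larger `width` trades missed detections for fewer false alarms.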
Pages: 652-658
Number of pages: 7