Execution anomaly detection in large-scale systems through console log analysis

Cited by: 42
Authors
Bao, Liang [1]
Li, Qian [2]
Lu, Peiyao [1]
Lu, Jie [1]
Ruan, Tongxiao [1]
Zhang, Ke [1]
Affiliations
[1] XiDian Univ, Sch Software, Xian 710071, Shaanxi, Peoples R China
[2] Xi An Jiao Tong Univ, Sch Econ & Finance, Xian 710061, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Log analysis; Execution anomaly detection; Control flow analysis; Trace anomaly index;
DOI
10.1016/j.jss.2018.05.016
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Execution anomaly detection is important for development, maintenance and performance tuning in large-scale systems. System console logs are a significant source of information for troubleshooting and problem diagnosis. However, manually inspecting logs to detect anomalies is infeasible due to the increasing volume and complexity of log files. Therefore, there is a substantial demand for automatic anomaly detection based on log analysis. In this paper, we propose a general method to mine console logs to detect system problems. We first give some formal definitions of the problem, and then extract the set of log statements in the source code and generate the reachability graph to reveal the reachable relations of log statements. After that, we parse the log files to create log messages by combining information about log statements with information retrieval techniques. These messages are grouped into execution traces according to their execution units. We propose a novel anomaly detection algorithm that considers traces as sequence data and uses a probabilistic suffix tree based method to organize and differentiate significant statistical properties possessed by the sequences. Experiments on a CloudStack testbed and a Hadoop production system show that our method can effectively detect running anomalies in comparison with four existing detection algorithms.
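The idea of treating execution traces as sequences and scoring them with a suffix-based probabilistic model can be illustrated with a minimal sketch. This is not the paper's algorithm or its trace anomaly index: it is a simplified variable-order context model standing in for a probabilistic suffix tree, and the class name `SuffixContextModel`, the parameters `max_depth` and `smoothing`, and the scoring rule (average negative log-likelihood per event) are all assumptions made for illustration.

```python
import math
from collections import defaultdict


class SuffixContextModel:
    """Simplified stand-in for a probabilistic suffix tree (hypothetical)."""

    def __init__(self, max_depth=2, smoothing=1e-3):
        self.max_depth = max_depth      # longest context (suffix) remembered
        self.smoothing = smoothing      # additive smoothing for unseen events
        self.counts = defaultdict(lambda: defaultdict(int))
        self.alphabet = set()

    def fit(self, traces):
        """Count which event follows each context of length 0..max_depth."""
        for trace in traces:
            self.alphabet.update(trace)
            for i, symbol in enumerate(trace):
                for depth in range(self.max_depth + 1):
                    if i - depth < 0:
                        break
                    context = tuple(trace[i - depth:i])
                    self.counts[context][symbol] += 1
        return self

    def prob(self, context, symbol):
        """P(symbol | longest suffix of context seen during training)."""
        context = tuple(context[-self.max_depth:]) if self.max_depth else ()
        while context and context not in self.counts:
            context = context[1:]       # back off to a shorter suffix
        nexts = self.counts.get(context, {})
        total = sum(nexts.values())
        vocab = len(self.alphabet) + 1  # one extra slot for unseen events
        return (nexts.get(symbol, 0) + self.smoothing) / (
            total + self.smoothing * vocab
        )

    def anomaly_score(self, trace):
        """Average negative log-likelihood per event; higher is more unusual."""
        if not trace:
            return 0.0
        log_p = sum(
            math.log(self.prob(trace[:i], symbol))
            for i, symbol in enumerate(trace)
        )
        return -log_p / len(trace)


# Hypothetical traces: each letter stands for one log statement identifier.
model = SuffixContextModel(max_depth=2).fit([["A", "B", "C"]] * 20)
normal_score = model.anomaly_score(["A", "B", "C"])
odd_score = model.anomaly_score(["A", "C", "B"])
print(normal_score < odd_score)
```

A trace whose events appear in an order never seen in training receives low conditional probabilities and therefore a higher score. In practice a threshold on this score would have to be chosen from data, which this sketch leaves out.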
Pages: 172-186
Page count: 15