A Convolutional Auto-encoder Method for Anomaly Detection on System Logs

Cited by: 7
Authors
Cui, Yu [1, 2]
Sun, Yiping [1, 2]
Hu, Jinglu [1]
Sheng, Gehao [2]
Affiliations
[1] Waseda Univ, Grad Sch Informat Prod & Syst, 2-7 Hibikino, Kitakyushu, Fukuoka, Japan
[2] Shanghai Jiao Tong Univ, Sch Elect Informat & Elect Engn, 800 Dongchuan Rd, Shanghai, Peoples R China
Source
2018 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC) | 2018
Keywords
Log Analysis; Anomaly Detection; Feature Extraction; Auto-encoder; Ant Colony Optimization;
DOI
10.1109/SMC.2018.00519
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Anomaly detection on system logs reports system failures using console logs collected from devices, which helps ensure system reliability. Most previous research splits logs into sequential time windows and treats each window as an independent instance for classification with popular machine learning methods such as the support vector machine (SVM), neglecting the temporal patterns in the logs. These approaches also suffer from information loss due to the vector representation, and from high dimensionality when there is a large number of log events. To address these deficiencies, and unlike most traditional methods, which use a vector to represent the behavior of a period at the macro level, we construct a 2D matrix that reveals more detailed system behavior within the time period by dividing each window into sequential subwindows. To provide a more efficient representation, we further use the ant colony optimization algorithm to find a highly coupled event template to serve as the horizontal index of the 2D window matrix, replacing the disordered one. To capture time dependencies, a multi-module convolutional auto-encoder is configured so that different parallel modules scan over different time intervals and extract information separately. These features are then concatenated in the latent space as the final input, which contains diversified time information, for classification by an SVM. Experiments on the Blue Gene/L log dataset show that the proposed method outperforms the state-of-the-art SVM method.
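The 2D window matrix described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `LogEvent` record, the function names, the equal-length subwindows, and the use of raw event counts as matrix entries are all assumptions made for illustration. Rows correspond to sequential subwindows and columns to event types, so summing each column recovers the single count vector that the abstract attributes to traditional methods.

```python
from collections import namedtuple

# Hypothetical log record: a timestamp in seconds and an integer event type id.
LogEvent = namedtuple("LogEvent", ["timestamp", "event_id"])


def window_matrix(events, window_start, window_len, n_sub, n_event_types):
    """Count events per (subwindow, event type) inside one time window.

    Assumption: the window is split into n_sub equal-length subwindows.
    """
    sub_len = window_len / n_sub
    matrix = [[0] * n_event_types for _ in range(n_sub)]
    for ev in events:
        offset = ev.timestamp - window_start
        if 0 <= offset < window_len:
            # Clamp guards against floating-point rounding at the upper edge.
            row = min(int(offset // sub_len), n_sub - 1)
            matrix[row][ev.event_id] += 1
    return matrix


def window_vector(matrix):
    """Collapse the matrix into one count vector per window (column sums)."""
    return [sum(col) for col in zip(*matrix)]


events = [
    LogEvent(0.5, 0),
    LogEvent(1.2, 2),
    LogEvent(1.9, 2),
    LogEvent(3.7, 1),
    LogEvent(4.2, 0),  # falls outside the window and is ignored
]
m = window_matrix(events, window_start=0.0, window_len=4.0, n_sub=4, n_event_types=3)
v = window_vector(m)
```

Here `m` keeps the information that both occurrences of event type 2 fall in the second subwindow, while `v` only records that the type occurred twice somewhere in the window. The column order in this sketch is simply the event id; the abstract's ant colony optimization step, which reorders the columns, is not shown.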
Pages: 3057-3062
Number of pages: 6
Related papers
18 in total
  • [1] [Anonymous], 2015, ADV NEURAL INFORM PR
  • [2] Basseville M, 1993, DETECTION ABRUPT CHA
  • [3] Beitzel S. M., 2004, Proceedings of Sheffield SIGIR 2004. The Twenty-Seventh Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, P321, DOI 10.1145/1008992.1009048
  • [4] Charette R., STAGGERING IMPACT IT
  • [5] Dorigo M., 1997, IEEE Transactions on Evolutionary Computation, V1, P53, DOI 10.1109/4235.585892
  • [6] Execution Anomaly Detection in Distributed Systems through Unstructured Log Analysis
    Fu, Qiang
    Lou, Jian-Guang
    Wang, Yi
    Li, Jiang
    [J]. 2009 9TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2009, : 149 - +
  • [7] Glerum K, 2009, SOSP'09: PROCEEDINGS OF THE TWENTY-SECOND ACM SIGOPS SYMPOSIUM ON OPERATING SYSTEMS PRINCIPLES, P103
  • [8] Hartigan J. A., 1979, Applied Statistics, V28, P100, DOI 10.2307/2346830
  • [9] Experience Report: System Log Analysis for Anomaly Detection
    He, Shilin
    Zhu, Jieming
    He, Pinjia
    Lyu, Michael R.
    [J]. 2016 IEEE 27TH INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING (ISSRE), 2016, : 207 - 218
  • [10] Failure prediction in IBM BlueGene/L event logs
    Liang, Yinglung
    Zhang, Yanyong
    Xiong, Hui
    Sahoo, Ramendra
    [J]. ICDM 2007: PROCEEDINGS OF THE SEVENTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2007, : 583 - +