Natural Language Processing-based Model for Log Anomaly Detection

Cited by: 3
Authors
Li, Zezhou [1]
Zhang, Jing [1]
Zhang, Xianbo [1]
Lin, Feng [1]
Wang, Chao [1]
Cai, Xingye [1]
Affiliations
[1] JD Tech, Beijing, People's Republic of China
Source
2022 2ND IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND ARTIFICIAL INTELLIGENCE (SEAI 2022) | 2022
Keywords
log anomaly detection; natural language processing; deep neural networks;
DOI
10.1109/SEAI55746.2022.9832400
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Logs are widely used in the IT industry, and log anomaly detection is essential for identifying the running status of systems. Conventional methods for this problem require sophisticated hand-written rules and intensive manual labor. In this paper, we propose a new model based on natural language processing techniques. To improve feature extraction and the quality of log template vectors, the model employs Part-of-Speech (PoS) tagging and Named Entity Recognition (NER). This reduces the reliance on hand-written rules, and a weight vector produced by NER is used to modify the template vector. The PoS property of each token in the template is analyzed first, which also reduces manual labor and helps allocate weights better. The weight of each token in the template is then used to modify the template vector, and the final detection on the modified template vectors is performed by deep neural networks (DNNs). The effectiveness of the model is tested on three datasets and compared with two state-of-the-art models. The evaluation results show that our model achieves better log anomaly detection.
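The abstract describes the pipeline only at a high level: PoS analysis of template tokens, per-token weights, a modified template vector. The following is a minimal sketch of that general idea, not the authors' implementation. Everything concrete in it is an assumption made for illustration: the hand-written PoS tags, the set of kept tags, the hash-based `toy_embedding` standing in for a pretrained word vector, and the example weights standing in for NER-derived importance scores. The DNN classifier stage is omitted.

```python
import hashlib


def filter_by_pos(tagged_tokens, keep_tags):
    """Keep only tokens whose PoS tag is in keep_tags.

    tagged_tokens is a list of (token, tag) pairs. In practice the tags
    would come from a PoS tagger; here they are supplied by hand.
    """
    return [token for token, tag in tagged_tokens if tag in keep_tags]


def toy_embedding(token, dim=8):
    """Deterministic stand-in for a pretrained word vector (hypothetical).

    Maps the first `dim` bytes of a SHA-256 digest into [-1, 1].
    dim must be at most 32, the digest length in bytes.
    """
    digest = hashlib.sha256(token.lower().encode("utf-8")).digest()
    return [(b / 255.0) * 2.0 - 1.0 for b in digest[:dim]]


def weighted_template_vector(tokens, weights, dim=8):
    """Weighted average of token vectors.

    `weights` plays the role of the per-token importance weights
    mentioned in the abstract; how the paper derives them is not
    specified there, so they are plain inputs in this sketch.
    """
    if len(tokens) != len(weights):
        raise ValueError("tokens and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    vector = [0.0] * dim
    for token, weight in zip(tokens, weights):
        embedding = toy_embedding(token, dim)
        for i in range(dim):
            vector[i] += weight * embedding[i]
    return [v / total for v in vector]


# Hand-tagged example template (tags are assumptions, Penn Treebank style).
tagged = [("Failed", "VBD"), ("to", "TO"), ("connect", "VB"),
          ("to", "TO"), ("server", "NN")]
kept = filter_by_pos(tagged, keep_tags={"NN", "VB", "VBD", "JJ"})

# Hypothetical importance weights, one per kept token.
template_vector = weighted_template_vector(kept, weights=[0.5, 0.2, 0.3])
```

In a fuller version of this sketch, `template_vector` would be the input to a sequence model or classifier. Replacing the toy embedding with real word vectors changes only `toy_embedding`.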
Pages: 129-134
Page count: 6