Logan: A Distributed Online Log Parser

Cited by: 19
Authors
Agrawal, Amey [1]
Karlupia, Rohit [1]
Gupta, Rajat [1]
Affiliation
[1] Qubole India Pvt Ltd, Bengaluru, India
Source
2019 IEEE 35th International Conference on Data Engineering (ICDE 2019) | 2019
Keywords
Log parsing; Online algorithm; Distributed processing
DOI
10.1109/ICDE.2019.00211
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Logs serve as a critical tool for debugging and monitoring applications. However, gaining insights from unstructured logs is difficult. Hence, many log management and analysis applications first parse logs into structured templates. In this paper, we train a data-driven log parser on our new Apache Spark dataset, the largest application log dataset yet. We implement a distributed online algorithm to accommodate the large volume of data. We also devise a new metric for evaluating parsers when labeled data is unavailable. We show that our method generalizes over diverse datasets without any parameter tuning or domain-specific input from the user. When evaluated on the publicly available HDFS dataset, our method runs 13x faster than the previous state of the art.
Pages: 1946-1951
Page count: 6