An Exploratory Study on How Non-Determinism in Large Language Models Affects Log Parsing

Cited by: 3
Authors
Astekin, Merve [1]
Hort, Max [1]
Moonen, Leon [1, 2]
Affiliations
[1] Simula Res Lab, Oslo, Norway
[2] BI Norwegian Business Sch, Oslo, Norway
Source
PROCEEDINGS 2024 IEEE/ACM 2ND INTERNATIONAL WORKSHOP ON INTERPRETABILITY, ROBUSTNESS, AND BENCHMARKING IN NEURAL SOFTWARE ENGINEERING, INTENSE 2024 | 2024
Keywords
log parsing; large language model; robustness; non-determinism; consistency
DOI
10.1145/3643661.3643952
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Most software systems used in production generate system logs that provide a rich source of information about the status and execution behavior of the system. These logs are commonly used to ensure the reliability and maintainability of software systems. The first step toward automated log analysis is generally log parsing, which aims to transform unstructured log messages into structured log templates and extract the corresponding parameters. Recently, Large Language Models (LLMs) such as ChatGPT have shown promising results on a wide range of software engineering tasks, including log parsing. However, the extent to which non-determinism influences log parsing using LLMs remains unclear. In particular, it is important to investigate whether LLMs behave consistently when faced with the same log message multiple times. In this study, we investigate the impact of non-determinism in state-of-the-art LLMs while performing log parsing. Specifically, we select six LLMs, including both paid proprietary and free-to-use models, and evaluate their non-determinism on 16 system logs obtained from a selection of mature open-source projects. The results of our study reveal varying degrees of non-determinism among models. Moreover, they show that there is no guarantee for deterministic results even with a temperature of zero.
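As a rough illustration of what "behaving consistently when faced with the same log message multiple times" could mean in practice, the sketch below counts how many distinct templates a parser returns across repeated runs and what share of runs agree with the most common one. This is not the metric or procedure used in the paper; the function name `consistency`, the `<*>` placeholder convention, and the sample outputs are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def consistency(templates):
    """Summarize agreement among repeated parses of one log message.

    Returns (number of distinct templates, share of runs that match
    the most common template). Hypothetical helper, not the paper's metric.
    """
    if not templates:
        raise ValueError("need at least one run")
    # Ignore leading/trailing whitespace differences between runs.
    counts = Counter(t.strip() for t in templates)
    _, top_count = counts.most_common(1)[0]
    return len(counts), top_count / len(templates)


# Hypothetical outputs of five repeated runs on the same log message.
runs = [
    "Connection from <*> closed",
    "Connection from <*> closed",
    "Connection from <*> closed",
    "Connection from <*> port <*> closed",
    "Connection from <*> closed",
]

distinct, agreement = consistency(runs)
print(distinct, agreement)
```

A fully deterministic parser would give one distinct template and an agreement share of 1.0 for every message; anything else indicates run-to-run variation of the kind the abstract describes.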
Pages: 13-18
Page count: 6
Related papers
29 in total
[1]   DILAF: A framework for distributed analysis of large-scale system logs for anomaly detection [J].
Astekin, Merve ;
Zengin, Harun ;
Sozer, Hasan .
SOFTWARE-PRACTICE & EXPERIENCE, 2019, 49 (02) :153-170
[2]   BERT-Log: Anomaly Detection for System Logs Based on Pre-trained Language Model [J].
Chen, Song ;
Liao, Hai .
APPLIED ARTIFICIAL INTELLIGENCE, 2022, 36 (01)
[3]   Execution Anomaly Detection in Distributed Systems through Unstructured Log Analysis [J].
Fu, Qiang ;
Lou, Jian-Guang ;
Wang, Yi ;
Li, Jiang .
2009 9TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2009, :149-+
[4]   LogBERT: Log Anomaly Detection via BERT [J].
Guo, Haixuan ;
Yuan, Shuhan ;
Wu, Xintao .
2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
[5]   LogMine: Fast Pattern Recognition for Log Analytics [J].
Hamooni, Hossein ;
Debnath, Biplob ;
Xu, Jianwu ;
Zhang, Hui ;
Jiang, Guofei ;
Mueen, Abdullah .
CIKM'16: PROCEEDINGS OF THE 2016 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2016, :1573-1582
[6]   Drain: An Online Log Parsing Approach with Fixed Depth Tree [J].
He, Pinjia ;
Zhu, Jieming ;
Zheng, Zibin ;
Lyu, Michael R. .
2017 IEEE 24TH INTERNATIONAL CONFERENCE ON WEB SERVICES (ICWS 2017), 2017, :33-40
[7]   Jiang Juyong, 2023, CodeUp: A Multilingual Code Generation Llama2 Model with Parameter-Efficient Instruction-Tuning
[8]   An automated approach for abstracting execution logs to execution events [J].
Jiang, Zhen Ming ;
Hassan, Ahmed E. ;
Hamann, Gilbert ;
Flora, Parminder .
JOURNAL OF SOFTWARE MAINTENANCE AND EVOLUTION-RESEARCH AND PRACTICE, 2008, 20 (04) :249-267
[9]   Jiang ZH, 2024, arXiv:2310.01796, DOI 10.48550/arXiv.2310.01796
[10]   Guidelines for Assessing the Accuracy of Log Message Template Identification Techniques [J].
Khan, Zanis Ali ;
Shin, Donghwan ;
Bianculli, Domenico ;
Briand, Lionel .
2022 ACM/IEEE 44TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2022), 2022, :1095-1106