The Effectiveness of Compact Fine-Tuned LLMs in Log Parsing

Times Cited: 0
Authors
Mehrabi, Maryam [1 ]
Hamou-Lhadj, Abdelwahab [1 ]
Savi, Hossein Moo [2 ]
Affiliations
[1] Concordia Univ, ECE, Montreal, PQ, Canada
[2] Cisco Syst, Ottawa, ON, Canada
Source
2024 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE AND EVOLUTION, ICSME 2024 | 2024
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
Log Parsing; Large Language Models; Machine Learning; Software Maintenance and Evolution;
DOI
10.1109/ICSME58944.2024.00047
CLC Number
TP31 [Computer Software];
Discipline Codes
081202; 0835;
Abstract
Log parsing is the process of extracting structured information from unstructured log data, an important step prior to many log analytics tasks. The emergence of Large Language Models (LLMs), like Generative Pre-trained Transformers (GPTs), has driven the development of novel log parsing methods. Existing studies have examined the effectiveness of large-scale general-purpose LLMs in log parsing. In this paper, we argue that the long-term adoption of such LLMs poses challenges of data privacy, cost, and tool integration. To address these challenges, we explore the viability of supervised fine-tuning of an open-source compact LLM for log parsing as a prospective alternative. To this end, we fine-tune the Mistral-7B-Instruct LLM on a diverse set of log files and evaluate its performance, in terms of both accuracy and robustness, against OpenAI's GPT-4-Turbo under different configuration settings. We apply two evaluation approaches, namely metric-based and LLM-based. Our overall findings show that fine-tuning a compact LLM such as Mistral-7B yields results similar to, and sometimes better than, a large-scale LLM, in our case GPT-4-Turbo. These findings are important because they enable companies to use a smaller LLM that they can readily adapt to parsing their log data, and integrate into their log analytics tools, without the need to rely on third-party LLM providers.
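The abstract does not spell out the training recipe, so the following is a minimal sketch, assuming a LoRA-style supervised fine-tuning setup built on the Hugging Face transformers, peft, and datasets libraries. The model ID, prompt format, training pairs, and hyperparameters below are illustrative assumptions, not the authors' configuration.

```python
# A minimal sketch (NOT the authors' exact pipeline): LoRA-style supervised
# fine-tuning of Mistral-7B-Instruct on (log line -> template) pairs.
import torch
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

model_id = "mistralai/Mistral-7B-Instruct-v0.2"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto")
# Wrap the base model with low-rank adapters so only a small set of
# parameters is trained (hyperparameters are illustrative).
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM"))

# Hypothetical training pair: raw log line -> template with <*> placeholders.
pairs = [{"log": "Received block blk_3587 of size 67108864",
          "template": "Received block <*> of size <*>"}]

def to_text(ex):
    # Mistral's instruction format; the prompt wording is an assumption.
    prompt = (f"[INST] Extract the log template, replacing variable parts "
              f"with <*>: {ex['log']} [/INST] {ex['template']}")
    return tokenizer(prompt, truncation=True, max_length=512)

ds = Dataset.from_list(pairs).map(to_text, remove_columns=["log", "template"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="mistral-logparser",
                           per_device_train_batch_size=1,
                           num_train_epochs=1, learning_rate=2e-4,
                           logging_steps=10),
    train_dataset=ds,
    # mlm=False gives standard causal-LM labels (next-token prediction).
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False))
trainer.train()
```

At inference time, the fine-tuned model would be prompted with a raw log line and its completion compared against the ground-truth template, which is how metric-based evaluations such as grouping or parsing accuracy are typically computed.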
Pages: 438-448
Number of Pages: 11