FlexFL: Flexible and Effective Fault Localization With Open-Source Large Language Models

Cited by: 0

Authors:

Xu, Chuyang [1]

Liu, Zhongxin [1,2]

Ren, Xiaoxue [1,2]

Zhang, Gehao [3]

Liang, Ming

Lo, David [4]

Affiliations:

[1] Zhejiang Univ, State Key Lab Blockchain & Data Secur, Hangzhou 310027, Peoples R China

[2] Hangzhou High Tech Zone Binjiang Inst Blockchain &, Hangzhou 310052, Peoples R China

[3] Ant Grp, Hangzhou 310013, Peoples R China

[4] Singapore Management Univ, Sch Comp & Informat Syst, Singapore 188065, Singapore

Source:

IEEE Transactions on Software Engineering | 2025, Vol. 51, No. 5

Funding:

National Natural Science Foundation of China; National Research Foundation, Singapore;

Keywords:

Computer bugs; Location awareness; Codes; Debugging; Pipelines; Large language models; Training; Data privacy; Source coding; Software systems; Fault localization; large language model; LLM-based agent; BUG LOCALIZATION;

DOI:

10.1109/TSE.2025.3553363

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Fault localization (FL) targets identifying bug locations within a software system, which can enhance debugging efficiency and improve software quality. Due to the impressive code comprehension ability of Large Language Models (LLMs), a few studies have proposed to leverage LLMs to locate bugs, i.e., LLM-based FL, and demonstrated promising performance. However, first, these methods are limited in flexibility. They rely on bug-triggering test cases to perform FL and cannot make use of other available bug-related information, e.g., bug reports. Second, they are built upon proprietary LLMs, which are, although powerful, confronted with risks in data privacy. To address these limitations, we propose a novel LLM-based FL framework named FlexFL, which can flexibly leverage different types of bug-related information and effectively work with open-source LLMs. FlexFL is composed of two stages. In the first stage, FlexFL reduces the search space of buggy code using state-of-the-art FL techniques of different families and provides a candidate list of bug-related methods. In the second stage, FlexFL leverages LLMs to delve deeper to double-check the code snippets of methods suggested by the first stage and refine fault localization results. In each stage, FlexFL constructs agents based on open-source LLMs, which share the same pipeline that does not postulate any type of bug-related information and can interact with function calls without the out-of-the-box capability. Extensive experimental results on Defects4J demonstrate that FlexFL outperforms the baselines and can work with different open-source LLMs. Specifically, FlexFL with a lightweight open-source LLM Llama3-8B can locate 42 and 63 more bugs than two state-of-the-art LLM-based FL approaches AutoFL and AgentFL that both use GPT-3.5. In addition, FlexFL can localize 93 bugs that cannot be localized by non-LLM-based FL techniques at the top 1. Furthermore, to mitigate potential data contamination, we conduct experiments on a dataset which Llama3-8B has not seen before, and the evaluation results show that FlexFL can also achieve good performance.
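The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`reduce_search_space`, `refine`, `located_at_top_n`), the round-robin merging rule, the technique labels, and the numeric scores are all assumptions made for illustration. In particular, the `score_method` callable only stands in for the LLM-based agent that inspects method code in the second stage.

```python
from typing import Callable, Dict, List, Set


def reduce_search_space(rankings: Dict[str, List[str]], top_k: int) -> List[str]:
    """Stage 1 (sketch): merge the top_k methods of each ranked list, skipping duplicates.

    The round-robin merge is an assumption; the abstract does not say how
    the outputs of the different FL techniques are combined.
    """
    candidates: List[str] = []
    for rank in range(top_k):
        for ranking in rankings.values():
            if rank < len(ranking) and ranking[rank] not in candidates:
                candidates.append(ranking[rank])
    return candidates


def refine(candidates: List[str], score_method: Callable[[str], float]) -> List[str]:
    """Stage 2 (sketch): re-rank candidates by a suspiciousness score.

    score_method is a hypothetical stand-in for an LLM agent that reads
    each candidate's code snippet. sorted() is stable, so ties keep their
    Stage 1 order.
    """
    return sorted(candidates, key=score_method, reverse=True)


def located_at_top_n(ranked: List[str], buggy_methods: Set[str], n: int = 1) -> bool:
    """Return True if any truly buggy method appears in the first n results."""
    return any(method in buggy_methods for method in ranked[:n])


# Hypothetical inputs: two ranked lists from unnamed FL techniques.
rankings = {
    "technique_a": ["Foo.parse", "Bar.render", "Baz.init"],
    "technique_b": ["Bar.render", "Qux.load", "Foo.parse"],
}
candidates = reduce_search_space(rankings, top_k=2)

# Made-up scores replacing the LLM-based inspection.
scores = {"Bar.render": 0.9, "Foo.parse": 0.4, "Qux.load": 0.1}
ranked = refine(candidates, lambda method: scores.get(method, 0.0))
hit_at_top_1 = located_at_top_n(ranked, {"Bar.render"}, n=1)
```

Keeping the second stage behind a plain callable means the same pipeline shape works whichever model or bug-related input drives the scoring, which mirrors the flexibility the abstract emphasizes.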


Pages: 1455-1471

Page count: 17
