Evaluating Fault Localization and Program Repair Capabilities of Existing Closed-Source General-Purpose LLMs

Cited by: 3
Authors
Jiang, Shengbei [1]
Zhang, Jiabao [1]
Chen, Wei [1]
Wang, Bo [1]
Zhou, Jianyi [2]
Zhang, Jie [3]
Affiliations
[1] Beijing Jiaotong University, Beijing, People's Republic of China
[2] Huawei Cloud Computing Technologies Co., Ltd., Beijing, People's Republic of China
[3] King's College London, London, England
Source
2024 International Workshop on Large Language Models for Code (LLM4Code 2024) | 2024
Funding
National Natural Science Foundation of China;
Keywords
Large Language Model; Fault Localization; Program Repair; Software Debugging;
DOI
10.1145/3643795.3648390
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Automated debugging is an emerging research field that aims to automatically find and repair bugs. Within this field, Fault Localization (FL) and Automated Program Repair (APR) have attracted the most research effort. Recently, researchers have adopted pre-trained Large Language Models (LLMs) to facilitate FL and APR, with promising results. However, the LLMs they used are either no longer available (such as Codex) or outdated (such as early versions of GPT). In this paper, we evaluate the performance of three popular, recent commercial closed-source general-purpose LLMs on FL and APR: ChatGPT 3.5, ERNIE Bot 3.5, and IFlytek Spark 2.0. We evaluate them on 120 real-world Java bugs from the Defects4J benchmark. For each of FL and APR, we designed three kinds of prompts that provide different kinds of information. The results show that these LLMs successfully located 53.3% and correctly fixed 12.5% of these bugs.
Pages: 75-78
Page count: 4