GPT-3-Powered Type Error Debugging: Investigating the Use of Large Language Models for Code Repair

Cited by: 11

Authors:

Ribeiro, Francisco ^{[1]}

Castro de Macedo, Jose Nuno ^{[1]}

Tsushima, Kanae ^{[2]}

Abreu, Rui ^{[3]}

Saraiva, Joao ^{[1]}

Affiliations:

[1] Univ Minho, HASLab, INESC TEC, Braga, Portugal

[2] Sokendai Univ, Natl Inst Informat, Tokyo, Japan

[3] Univ Porto, INESC ID, Porto, Portugal

Source:

PROCEEDINGS OF THE 16TH ACM SIGPLAN INTERNATIONAL CONFERENCE ON SOFTWARE LANGUAGE ENGINEERING, SLE 2023 | 2023

Keywords:

Automated Program Repair; GPT-3; Fault Localization; Code Generation

DOI:

10.1145/3623476.3623522

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Type systems are responsible for assigning types to terms in programs. That way, they enforce the actions that can be taken and can, consequently, detect type errors during compilation. However, while they are able to flag the existence of an error, they often fail to pinpoint its cause or provide a helpful error message. Thus, without adequate support, debugging this kind of error can take a considerable amount of effort. Recently, neural network models have been developed that are able to understand programming languages and perform several downstream tasks. We argue that type error debugging can be enhanced by taking advantage of this deeper understanding of the language's structure. In this paper, we present a technique that leverages GPT-3's capabilities to automatically fix type errors in OCaml programs. We perform multiple source code analysis tasks to produce useful prompts that are then provided to GPT-3 to generate potential patches. Our publicly available tool, Mentat, supports multiple modes and was validated on an existing public dataset with thousands of OCaml programs. We automatically validate successful repairs by using QuickCheck to verify which generated patches produce the same output as the user-intended fixed version, achieving a 39% repair rate. In a comparative study, Mentat outperformed two other techniques in automatically fixing ill-typed OCaml programs.

Citation

Pages: 111-124

Page count: 14
