Rete: Learning Namespace Representation for Program Repair

Cited by: 5
Authors
Parasaram, Nikhil [1]
Barr, Earl T. [1]
Mechtaev, Sergey [1]
Affiliations
[1] UCL, London, England
Source
2023 IEEE/ACM 45TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ICSE | 2023
Keywords
Program Repair; Deep Learning; Patch Prioritisation; Variable Representation; Automated Repair
DOI
10.1109/ICSE48619.2023.00112
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
A key challenge of automated program repair is finding correct patches in the vast search space of candidate patches. Real-world programs define large namespaces of variables, which contribute considerably to the search space explosion. Existing program repair approaches neglect information about the program namespace, which makes them inefficient and increases the chance of test-overfitting. We propose RETE, a new program repair technique that learns project-independent information about the program namespace and uses it to navigate the search space of patches. RETE uses a neural network to extract project-independent information about variable CDU chains, that is, def-use chains augmented with control flow. Then, it ranks patches by jointly ranking variables and the patch templates into which the variables are inserted. We evaluated RETE on 142 bugs extracted from two datasets, ManyBugs and BugsInPy. Our experiments demonstrate that RETE generates six new correct patches that fix bugs that previous tools did not repair, an improvement of 31% and 59% over the existing state of the art.
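To make the notion of a def-use chain concrete, the following is a minimal illustrative sketch, not RETE's implementation. It assumes straight-line, top-level Python statements and ignores control flow entirely, so it shows only the plain def-use part that CDU chains build on. The function name `def_use_chains` and the sample program are hypothetical.

```python
import ast
from collections import defaultdict


def def_use_chains(source):
    """Map each (variable, definition line) to the lines that read it.

    Assumption: only straight-line, top-level statements are handled;
    branches and loops (the control-flow part of a CDU chain) are ignored.
    """
    chains = defaultdict(list)
    latest_def = {}  # variable name -> line of its most recent definition
    for stmt in ast.parse(source).body:
        names = [n for n in ast.walk(stmt) if isinstance(n, ast.Name)]
        # Reads in a statement see the definitions made before it.
        for n in names:
            if isinstance(n.ctx, ast.Load) and n.id in latest_def:
                chains[(n.id, latest_def[n.id])].append(n.lineno)
        # Writes take effect only after the statement's reads.
        for n in names:
            if isinstance(n.ctx, ast.Store):
                latest_def[n.id] = n.lineno
                chains.setdefault((n.id, n.lineno), [])
    return dict(chains)


# Hypothetical four-line program: 'a' is defined twice, so it has two chains.
example = "a = 1\nb = a + 2\na = b * a\nc = a\n"
chains = def_use_chains(example)
for (name, def_line), uses in sorted(chains.items(), key=lambda kv: kv[0][1]):
    print(f"{name} defined at line {def_line}, used at lines {uses}")
```

Each chain ties one definition of a variable to the places that read it, which is the kind of per-variable usage information the abstract says is fed to the neural network once it has been augmented with control flow.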
Pages: 1264-1276
Number of pages: 13