Getafix: Learning to Fix Bugs Automatically

Cited by: 148
Authors
Bader, Johannes [1]
Scott, Andrew [1]
Pradel, Michael [1]
Chandra, Satish [1]
Affiliations
[1] Facebook, Menlo Park, CA 94025 USA
Source
PROCEEDINGS OF THE ACM ON PROGRAMMING LANGUAGES-PACMPL | 2019 / Vol. 3 / Issue OOPSLA
Keywords
Automated program repair; Patch generation; Code transform;
DOI
10.1145/3360585
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Subject classification code
081202 ; 0835 ;
Abstract
Static analyzers help find bugs early by warning about recurring bug categories. While fixing these bugs still remains a mostly manual task in practice, we observe that fixes for a specific bug category often are repetitive. This paper addresses the problem of automatically fixing instances of common bugs by learning from past fixes. We present Getafix, an approach that produces human-like fixes while being fast enough to suggest fixes in time proportional to the amount of time needed to obtain static analysis results in the first place. Getafix is based on a novel hierarchical clustering algorithm that summarizes fix patterns into a hierarchy ranging from general to specific patterns. Instead of an expensive exploration of a potentially large space of candidate fixes, Getafix uses a simple yet effective ranking technique that uses the context of a code change to select the most appropriate fix for a given bug. Our evaluation applies Getafix to 1,268 bug fixes for six bug categories reported by popular static analyzers for Java, including null dereferences, incorrect API calls, and misuses of particular language constructs. The approach predicts exactly the human-written fix as the top-most suggestion between 12% and 91% of the time, depending on the bug category. The top-5 suggestions contain fixes for 526 of the 1,268 bugs. Moreover, we report on deploying the approach within Facebook, where it contributes to the reliability of software used by billions of people. To the best of our knowledge, Getafix is the first industrially-deployed automated bug-fixing tool that learns fix patterns from past, human-written fixes to produce human-like fixes.
Pages: 27
Related papers
31 in total
  • [1] Aftandilian E., 2012, 2012 12th IEEE Working Conference on Source Code Analysis and Manipulation (SCAM 2012), P14, DOI 10.1109/SCAM.2012.28
  • [2] A Survey of Machine Learning for Big Code and Naturalness
    Allamanis, Miltiadis
    Barr, Earl T.
    Devanbu, Premkumar
    Sutton, Charles
    [J]. ACM COMPUTING SURVEYS, 2018, 51 (04)
  • [3] [Anonymous], 2017, ABS171011054 CORR
  • [4] Benzecri Jean-Paul, 1982, Cahiers de l'analyse des donnees, V7, P209
  • [5] The Care and Feeding of Wild-Caught Mutants
    Brown, David Bingham
    Vaughn, Michael
    Liblit, Ben
    Reps, Thomas
    [J]. ESEC/FSE 2017: PROCEEDINGS OF THE 2017 11TH JOINT MEETING ON FOUNDATIONS OF SOFTWARE ENGINEERING, 2017, : 511 - 522
  • [6] Moving Fast with Software Verification
    Calcagno, Cristiano
    Distefano, Dino
    Dubreil, Jeremy
    Gabi, Dominik
    Hooimeijer, Pieter
    Luca, Martino
    O'Hearn, Peter
    Papakonstantinou, Irene
    Purbrick, Jim
    Rodriguez, Dulma
    [J]. NASA FORMAL METHODS (NFM 2015), 2015, 9058 : 3 - 11
  • [7] What Developers Want and Need from Program Analysis: An Empirical Study
    Christakis, Maria
    Bird, Christian
    [J]. 2016 31ST IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE), 2016, : 332 - 343
  • [8] Cornu B., 2015, ARXIV PREPRINT ARXIV
  • [9] Falleri J., 2014, P 29 INT C AUTOMATED, P313, DOI 10.1145/2642937.2642982
  • [10] Gupta R, 2017, AAAI CONF ARTIF INTE, P1345