An Empirical Investigation into Learning Bug-Fixing Patches in the Wild via Neural Machine Translation

Cited by: 91
Authors
Tufano, Michele [1]
Watson, Cody [1]
Bavota, Gabriele [2]
Di Penta, Massimiliano [3]
White, Martin [1]
Poshyvanyk, Denys [1]
Affiliations
[1] Coll William & Mary, Williamsburg, VA 23185 USA
[2] Univ Svizzera Italiana USI, Lugano, Switzerland
[3] Univ Sannio, Benevento, Italy
Source
PROCEEDINGS OF THE 2018 33RD IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE '18) | 2018
Keywords
neural machine translation; bug-fixes; COMMIT;
DOI
10.1145/3238147.3240732
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
Millions of open-source projects with numerous bug fixes are available in code repositories. This proliferation of software development histories can be leveraged to learn how to fix common programming bugs. To explore such a potential, we perform an empirical study to assess the feasibility of using Neural Machine Translation techniques for learning bug-fixing patches for real defects. We mine millions of bug-fixes from the change histories of GitHub repositories to extract meaningful examples of such bug-fixes. Then, we abstract the buggy and corresponding fixed code, and use them to train an Encoder-Decoder model able to translate buggy code into its fixed version. Our model is able to fix hundreds of unique buggy methods in the wild. Overall, this model is capable of predicting fixed patches generated by developers in 9% of the cases.
Pages: 832-837
Page count: 6