An Empirical Investigation into Learning Bug-Fixing Patches in the Wild via Neural Machine Translation

Cited by: 91

Authors:

Tufano, Michele ^{[1]}

Watson, Cody ^{[1]}

Bavota, Gabriele ^{[2]}

Di Penta, Massimiliano ^{[3]}

White, Martin ^{[1]}

Poshyvanyk, Denys ^{[1]}

Affiliations:

[1] Coll William & Mary, Williamsburg, VA 23185 USA

[2] Univ Svizzera Italiana USI, Lugano, Switzerland

[3] Univ Sannio, Benevento, Italy

Source:

PROCEEDINGS OF THE 2018 33RD IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE '18) | 2018

Keywords:

neural machine translation; bug-fixes; COMMIT;

DOI:

10.1145/3238147.3240732

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Millions of open-source projects with numerous bug fixes are available in code repositories. This proliferation of software development histories can be leveraged to learn how to fix common programming bugs. To explore such a potential, we perform an empirical study to assess the feasibility of using Neural Machine Translation techniques for learning bug-fixing patches for real defects. We mine millions of bug-fixes from the change histories of GitHub repositories to extract meaningful examples of such bug-fixes. Then, we abstract the buggy and corresponding fixed code, and use them to train an Encoder-Decoder model able to translate buggy code into its fixed version. Our model is able to fix hundreds of unique buggy methods in the wild. Overall, this model is capable of predicting fixed patches generated by developers in 9% of the cases.

Citation

Pages: 832-837

Page count: 6

References: 39 in total

[21] The ManyBugs and IntroClass Benchmarks for Automated Repair of C Programs
Le Goues, Claire
Holtschulte, Neal
Smith, Edward K.
Brun, Yuriy
Devanbu, Premkumar
Forrest, Stephanie
Weimer, Westley
[J]. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2015, 41 (12) : 1236 - 1256
[22] Le Goues C, 2012, PROC INT CONF SOFTW, P3, DOI 10.1109/ICSE.2012.6227211
[23] History Driven Program Repair
Le, Xuan-Bach D.
Lo, David
Le Goues, Claire
[J]. 2016 IEEE 23RD INTERNATIONAL CONFERENCE ON SOFTWARE ANALYSIS, EVOLUTION, AND REENGINEERING (SANER), VOL 1, 2016, : 213 - 224
[24] Automatic Patch Generation by Learning Correct Code
Long, Fan
Rinard, Martin
[J]. ACM SIGPLAN NOTICES, 2016, 51 (01) : 298 - 312
[25] Luong T., 2015, Effective approaches to attention-based neural machine translation, P1412
[26] Automatic repair of real bugs in Java: a large-scale experiment on the Defects4J dataset
Martinez, Matias
Durieux, Thomas
Sommerard, Romain
Xuan, Jifeng
Monperrus, Martin
[J]. EMPIRICAL SOFTWARE ENGINEERING, 2017, 22 (04) : 1936 - 1964
[27] Parr Terence, 2013, The definitive ANTLR 4 reference, V2nd
[28] Raychev V, 2014, ACM SIGPLAN NOTICES, V49, P419, DOI [10.1145/2666356.2594321, 10.1145/2594291.2594321]
[29] Seacord Robert C., 2003, MODERNIZING LEGACY S
[30] Is the Cure Worse Than the Disease? Overfitting in Automated Program Repair
Smith, Edward K.
Barr, Earl T.
Le Goues, Claire
Brun, Yuriy
[J]. 2015 10TH JOINT MEETING OF THE EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND THE ACM SIGSOFT SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE 2015) PROCEEDINGS, 2015, : 532 - 543
