Benchmarking and Categorizing the Performance of Neural Program Repair Systems for Java

Cited by: 0

Authors:

Zhong, Wenkang [1]

Li, Chuanyi [1]

Liu, Kui [2]

Ge, Jidong [1]

Luo, Bin [1]

Bissyande, Tegawende F. [3]

Ng, Vincent [4]

Affiliations:

[1] Nanjing Univ, State Key Lab Novel Software & Technol, Nanjing, Peoples R China

[2] Huawei Software Engn Applicat Technol Lab, Hangzhou, Peoples R China

[3] Univ Luxembourg, Luxembourg, Luxembourg

[4] Univ Texas Dallas, Richardson, TX USA

Source:

ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY | 2024 / Vol. 34 / No. 01

Funding:

European Research Council; National Natural Science Foundation of China;

Keywords:

datasets; program repair; benchmark; empirical study;

DOI:

10.1145/3688834

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Recent years have seen a rise in Neural Program Repair (NPR) systems in the software engineering community, which adopt advanced deep learning techniques to automatically fix bugs. Having a comprehensive understanding of existing systems can facilitate new improvements in this area and provide practical instructions for users. However, we observe two potential weaknesses in the current evaluation of NPR systems: (1) published systems are trained with varying data, and (2) NPR systems are roughly evaluated through the number of totally fixed bugs. Questions such as what types of bugs are repairable for current systems cannot be answered yet. Consequently, researchers cannot make target improvements in this area and users have no idea of the real affair of existing systems. In this article, we perform a systematic evaluation of the existing nine state-of-the-art NPR systems. To perform a fair and detailed comparison, we (1) build a new benchmark and framework that supports training and validating the nine systems with unified data and (2) evaluate re-trained systems with detailed performance analysis, especially on the effectiveness and the efficiency. We believe our benchmark tool and evaluation results could offer practitioners the real affairs of current NPR systems and the implications of further facilitating the improvements of NPR.

Pages: 35
