Is the Ground Truth Really Accurate? Dataset Purification for Automated Program Repair

Cited by: 7
Authors
Yang, Deheng [1]
Lei, Yan [2]
Mao, Xiaoguang [1]
Lo, David [3]
Xie, Huan [2]
Yan, Meng [2]
Affiliations
[1] Natl Univ Def Technol, Changsha, Peoples R China
[2] Chongqing Univ, Chongqing, Peoples R China
[3] Singapore Management Univ, Singapore, Singapore
Source
2021 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ANALYSIS, EVOLUTION AND REENGINEERING (SANER 2021), 2021
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
bug dataset; automated program repair; dataset purification; CODE; GENERATION;
DOI
10.1109/SANER50967.2021.00018
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Datasets of real-world bugs shipped with human-written patches are intensively used in the evaluation of existing automated program repair (APR) techniques, wherein the human-written patches always serve as the ground truth, for manual or automated assessment approaches, to evaluate the correctness of test-suite adequate patches. An inaccurate human-written patch tangled with other code changes will pose threats to the reliability of the assessment results. Therefore, the construction of such datasets always requires much manual effort on isolating real bug fixes from bug fixing commits. However, the manual work is time-consuming and prone to mistakes, and little has been known on whether the ground truth in such datasets is really accurate. In this paper, we propose DEPTEST, an automated DatasEt Purification technique from the perspective of triggering Tests. Leveraging coverage analysis and delta debugging, DEPTEST can automatically identify and filter out the code changes irrelevant to the bug exposed by triggering tests. To measure the strength of DEPTEST, we run it on the most extensively used dataset (i.e., Defects4J) that claims to already exclude all irrelevant code changes for each bug fix via manual purification. Our experiment indicates that even in a dataset where the bug fix is claimed to be well isolated, 41.01% of human-written patches can be further reduced by 4.3 lines on average, with the largest reduction reaching up to 53 lines. This indicates its great potential in assisting in the construction of datasets of accurate bug fixes. Furthermore, based on the purified patches, we re-dissect Defects4J and systematically revisit the APR of multi-chunk bugs to provide insights for future research targeting such bugs.
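The abstract names delta debugging as one ingredient of DEPTEST but does not spell out the procedure. As a rough illustration only, the following is a minimal sketch of a generic ddmin-style reduction over patch chunks, assuming a hypothetical oracle `triggering_tests_pass` that reports whether the triggering tests still pass when only the given chunks are applied. The chunk names, the oracle, and the function `ddmin` are illustrative assumptions and are not taken from the paper.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def ddmin(items: Sequence[T], still_fixes: Callable[[List[T]], bool]) -> List[T]:
    """Shrink `items` to a subset that still satisfies `still_fixes`.

    Generic delta-debugging-style reduction; assumes still_fixes(list(items))
    is True on entry. Not the DEPTEST implementation.
    """
    current = list(items)
    n = 2  # current granularity: number of parts to split into
    while len(current) >= 2:
        size = max(1, len(current) // n)
        parts = [current[i:i + size] for i in range(0, len(current), size)]
        reduced = False
        # First try each part on its own.
        for part in parts:
            if still_fixes(part):
                current, n, reduced = part, 2, True
                break
        # Otherwise try dropping one part at a time.
        if not reduced:
            for i in range(len(parts)):
                rest = [x for j, p in enumerate(parts) if j != i for x in p]
                if still_fixes(rest):
                    current, n, reduced = rest, max(n - 1, 2), True
                    break
        # Nothing worked: split finer, or stop at single-item granularity.
        if not reduced:
            if n >= len(current):
                break
            n = min(len(current), n * 2)
    return current


# Hypothetical patch chunks; only two of them are needed for the fix.
patch_chunks = ["null_check", "rename_var", "off_by_one_fix", "comment", "logging"]
needed = {"null_check", "off_by_one_fix"}


def triggering_tests_pass(chunks: List[str]) -> bool:
    # Stand-in oracle: a real one would apply the chunks and run the tests.
    return needed.issubset(chunks)


purified = ddmin(patch_chunks, triggering_tests_pass)
print(purified)
```

In this toy setup the three chunks unrelated to the triggering tests are filtered out. A real oracle would be far more expensive, since each query means applying a candidate patch and running tests, which is presumably one reason to narrow the candidates with coverage analysis first.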
Pages: 96-107
Page count: 12