Bench4BL: Reproducibility Study on the Performance of IR-Based Bug Localization

Cited by: 51
Authors
Lee, Jaekwon [1]
Kim, Dongsun [1]
Bissyande, Tegawende F. [1]
Jung, Woosung [2]
Le Traon, Yves [1]
Affiliations
[1] Univ Luxembourg, SnT, Luxembourg, Luxembourg
[2] Seoul Natl Univ Educ, Seoul, South Korea
Source
ISSTA'18: PROCEEDINGS OF THE 27TH ACM SIGSOFT INTERNATIONAL SYMPOSIUM ON SOFTWARE TESTING AND ANALYSIS | 2018
Funding
National Research Foundation, Singapore;
Keywords
Reproducibility studies; bug localization; information retrieval; INFORMATION-RETRIEVAL; FEATURE LOCATION; EXECUTION; RANKING;
DOI
10.1145/3213846.3213856
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
In recent years, the use of Information Retrieval (IR) techniques to automate the localization of buggy files, given a bug report, has shown promising results. The abundance of approaches in the literature, however, contrasts with the reality of IR-based bug localization (IRBL) adoption by developers (or even by the research community to complement other research approaches). Presumably, this situation is due to the lack of comprehensive evaluations for state-of-the-art approaches which offer insights into the actual performance of the techniques. We report on a comprehensive reproduction study of six state-of-the-art IRBL techniques. This study applies not only subjects used in existing studies (old subjects) but also 46 new subjects (61,431 Java files and 9,459 bug reports) to the IRBL techniques. In addition, the study compares two different version matching (between bug reports and source code files) strategies to highlight our observations related to performance deterioration. We also vary test file inclusion to investigate the effectiveness of IRBL techniques on test files, or its noise impact on performance. Finally, we assess potential performance gain if duplicate bug reports are leveraged.
Pages: 61-72
Page count: 12