Duplicate Bug Report Detection: How Far Are We?

Cited by: 6
Authors
Zhang, Ting [1]
Han, Donggyun [2]
Vinayakarao, Venkatesh [3]
Irsan, Ivana Clairine [1]
Xu, Bowen [1]
Thung, Ferdian [1]
Lo, David [1]
Jiang, Lingxiao [1]
Affiliations
[1] Singapore Management Univ, Singapore, Singapore
[2] Royal Holloway Univ London, London, England
[3] Chennai Math Inst, Chennai, Tamil Nadu, India
Funding
National Research Foundation, Singapore
Keywords
Bug reports; duplicate bug report detection; deep learning; empirical study; performance
DOI
10.1145/3576042
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Many Duplicate Bug Report Detection (DBRD) techniques have been proposed in the research literature, while industry uses other techniques. Unfortunately, there has been insufficient comparison among them, and it is unclear how far we have come. This work fills this gap by comparing these techniques. To compare them, we first need a benchmark that can estimate how a tool would perform if applied in a realistic setting today. Thus, we first investigated potential biases that affect the fair comparison of the accuracy of DBRD techniques. Our experiments suggest that data age and issue tracking system (ITS) choice cause a significant difference. Based on these findings, we prepared a new benchmark. We then used it to evaluate DBRD techniques to better estimate how far we have come. Surprisingly, a simpler technique outperforms recently proposed sophisticated techniques on most projects in our benchmark. In addition, we compared the DBRD techniques proposed in research with those used in Mozilla and VSCode. Surprisingly, we observe that a simple technique already adopted in practice can achieve results comparable to those of a recently proposed research tool. Our study offers reflections on the current state of DBRD, and we share our insights to benefit future DBRD research.
Pages: 32