A data-driven approach for understanding invalid bug reports: An industrial case study

Times cited: 0
Authors
Laiq, Muhammad [1]
bin Ali, Nauman [1]
Borstler, Jurgen [1]
Engstrom, Emelie [2]
Affiliations
[1] Blekinge Inst Technol, Dept Software Engn, SE-37179 Karlskrona, Sweden
[2] Lund Univ, Dept Software Engn, SE-22100 Lund, Sweden
Keywords
Software maintenance; Invalid bug reports; Bug management; Topic modeling; LDA; Bug classification; Software analytics; SEVERITY; MODEL
DOI
10.1016/j.infsof.2023.107305
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Context: Bug reports created during software development and maintenance do not always describe deviations from a system's valid behavior. Such invalid bug reports may consume significant resources and adversely affect the prioritization and resolution of valid bug reports. There is a need to identify preventive actions to reduce the inflow of invalid bug reports. Existing research has shown that manually analyzing invalid bug report descriptions provides cues regarding preventive actions. However, such a manual approach is not cost-effective due to the time required to analyze a sufficiently large number of bug reports needed to identify useful patterns. Furthermore, the analysis needs to be repeated as the underlying causes of invalid bug reports change over time.
Objective: In this study, we propose and evaluate the use of Latent Dirichlet Allocation (LDA), a topic modeling approach, to support practitioners in suggesting preventive actions to avoid the creation of similar invalid bug reports in the future.
Method: In an industrial case study, we first manually analyzed descriptions of invalid bug reports to identify common patterns in their descriptions. We further investigated to what extent LDA can support this manual process. We used expert-based validation to evaluate the relevance of identified common patterns and their usefulness in suggesting preventive measures.
Results: We found that invalid bug reports have common patterns that are perceived as relevant, and they can be used to devise preventive measures. Furthermore, the identification of common patterns can be supported with automation.
Conclusion: Using LDA, practitioners can effectively identify representative groups of bug reports (i.e., relevant common patterns) from a large number of bug reports and analyze them further to devise preventive measures.
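Illustration: the abstract describes using LDA to group invalid bug reports into common patterns. The sketch below is not the authors' implementation. It is a minimal collapsed Gibbs sampler for LDA in plain Python, meant only to show the general idea of assigning report descriptions to topics. The sample reports, the function names (`tokenize`, `fit_lda`, `top_words`), and the hyperparameter values are all assumptions made for this example.

```python
import random

def tokenize(text):
    # Lowercase and keep alphabetic tokens longer than two characters.
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return [w for w in cleaned.split() if len(w) > 2]

def fit_lda(docs, num_topics=2, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative, not optimized).

    Returns (doc_topic, topic_word, vocab): per-document topic
    proportions, per-topic word counts, and the sorted vocabulary.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    ndk = [[0] * num_topics for _ in docs]      # document-topic counts
    nkw = [[0] * V for _ in range(num_topics)]  # topic-word counts
    nk = [0] * num_topics                       # tokens assigned to each topic
    assignments = []
    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            ndk[d][k] += 1
            nkw[k][word_id[w]] += 1
            nk[k] += 1
        assignments.append(zs)
    # Resample each token's topic conditioned on all other assignments.
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wi = word_id[w]
                k = assignments[d][i]
                ndk[d][k] -= 1
                nkw[k][wi] -= 1
                nk[k] -= 1
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][wi] + beta) / (nk[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                ndk[d][k] += 1
                nkw[k][wi] += 1
                nk[k] += 1
    doc_topic = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in row]
        for row, doc in zip(ndk, docs)
    ]
    return doc_topic, nkw, vocab

def top_words(topic_word, vocab, n=3):
    # The n most frequent words of each topic, as a rough topic label.
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in topic_word
    ]

# Hypothetical bug report descriptions, invented for this example.
reports = [
    "Login fails because test environment configuration is wrong",
    "Wrong configuration of test environment causes timeout",
    "Duplicate report of an already fixed display issue",
    "Display issue already fixed in the latest release",
]
docs = [tokenize(r) for r in reports]
doc_topic, topic_word, vocab = fit_lda(docs)

for report, dist in zip(reports, doc_topic):
    dominant = max(range(len(dist)), key=dist.__getitem__)
    print(dominant, report)
print(top_words(topic_word, vocab))
```

In practice, a maintained library implementation and a preprocessing pipeline (stop-word removal, stemming, topic-number selection) would likely replace this toy sampler; the point here is only how reports with similar vocabulary end up sharing a dominant topic that an expert can then inspect.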
Pages: 12