Replicating and Re-Evaluating the Theory of Relative Defect-Proneness

Cited by: 12
Authors
Syer, Mark D. [1]
Nagappan, Meiyappan [1]
Adams, Bram [2]
Hassan, Ahmed E. [1]
Affiliations
[1] Queens Univ, Sch Comp, Kingston, ON, Canada
[2] Ecole Polytech, Genie Informat & Genie Logiciel, Montreal, PQ H3C 3A7, Canada
Keywords
Survival analysis; Cox models; defect modelling; PREDICTION;
DOI
10.1109/TSE.2014.2361131
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
A good understanding of the factors impacting defects in software systems is essential for software practitioners, because it helps them prioritize quality improvement efforts (e.g., testing and code reviews). Defect prediction models are typically built using classification or regression analysis on product and/or process metrics collected at a single point in time (e.g., a release date). However, current defect prediction models only predict whether a defect will occur, not when, which makes it difficult to prioritize software quality improvement efforts. To address this problem, Koru et al. applied survival analysis techniques to a large number of software systems to study how size (i.e., lines of code) influences the probability that a source code module (e.g., class or file) will experience a defect at any given time. Given that 1) the work of Koru et al. has been instrumental to our understanding of the size-defect relationship, 2) the use of survival analysis in the context of defect modelling has not been well studied, and 3) replication studies are an important component of balanced scholarly debate, we present a replication study of the work by Koru et al. In particular, we present the details necessary to use survival analysis in the context of defect modelling (such details were missing from the original paper by Koru et al.). We also explore how differences between the traditional domains of survival analysis (i.e., medicine and epidemiology) and defect modelling impact our understanding of the size-defect relationship. Practitioners and researchers considering the use of survival analysis should be aware of the implications of our findings.
Pages: 176-197
Number of pages: 22