Investigating the impact of fault data completeness over time on predicting class fault-proneness

Cited by: 7
Authors
Al Dallal, Jehad [1]
Morasca, Sandro [2]
Affiliations
[1] Kuwait Univ, Dept Informat Sci, POB 5969, Safat 13060, Kuwait
[2] Univ Insubria, Dept Theoret & Appl Sci, Via Mazzini 5, I-21100 Varese, Italy
Keywords
Internal quality attributes; Quality measures; Class fault-proneness; Object-oriented software; Fault data; OPEN SOURCE SOFTWARE; EMPIRICAL VALIDATION; METRICS; COHESION; QUALITY
DOI
10.1016/j.infsof.2017.11.001
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Context: The adequacy of fault-proneness prediction models in representing the relationship between the internal quality of classes and their fault-proneness relies on several factors. One of these factors is the completeness of the fault data. A fault-proneness prediction model that is built using fault data collected during testing or within a relatively short period after release may be inadequate and may not be reliable enough in predicting faulty classes.
Objective: We empirically study the relationship between the time interval since a system is released and the performance of the fault-proneness prediction models constructed based on the fault data reported within the time interval.
Method: We construct prediction models using fault data collected at several time intervals since a system has been released and study the performance of the models in representing the relationship between the internal quality of classes and their fault-proneness. In addition, we empirically explore the relationship between the performance of a prediction model and the percentage of increase in the number of classes detected as faulty (PIF) over time.
Results: Our results show evidence in favor of the expectation that prediction models that use more complete fault data, to a certain extent, more adequately represent the expected relationship between the internal quality of classes and their fault-proneness and have better performance. A threshold based on the PIF value can be used as an indicator for deciding when to stop collecting fault data. When this threshold is reached, collecting additional fault data will not significantly improve the prediction ability of the constructed model.
Conclusion: When constructing fault-proneness prediction models, researchers and software engineers are advised to rely on systems that have relatively long maintenance histories. Researchers and software engineers can use the PIF value as an indicator for deciding when to stop collecting fault data.
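The PIF-based stopping rule described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this example assumes PIF is the percentage increase in the cumulative number of faulty classes between two consecutive collection intervals. The function names, the sample counts, and the 5% threshold are hypothetical and are not taken from the paper.

```python
from typing import List, Optional


def pif_series(faulty_counts: List[int]) -> List[float]:
    """Percentage increase in the cumulative number of faulty classes
    between consecutive intervals (assumed definition of PIF)."""
    pifs = []
    for prev, curr in zip(faulty_counts, faulty_counts[1:]):
        if prev <= 0:
            raise ValueError("PIF is undefined when the previous count is not positive")
        pifs.append(100.0 * (curr - prev) / prev)
    return pifs


def first_interval_below(faulty_counts: List[int], threshold_pct: float) -> Optional[int]:
    """Return the index of the first interval whose PIF falls below the
    threshold, or None if no interval does. Index i refers to
    faulty_counts[i], compared against faulty_counts[i - 1]."""
    for i, pif in enumerate(pif_series(faulty_counts), start=1):
        if pif < threshold_pct:
            return i
    return None


# Hypothetical cumulative counts of faulty classes at four successive intervals.
counts = [40, 60, 66, 68]
stop_at = first_interval_below(counts, threshold_pct=5.0)  # hypothetical threshold
```

Under these assumptions, data collection would stop at the first interval where the growth in faulty classes drops below the chosen threshold; the appropriate threshold itself would have to come from the study's results.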
Pages: 86-105
Number of pages: 20