An industrial case study of classifier ensembles for locating software defects

Cited by: 45
Authors
Misirli, Ayse Tosun [1]
Bener, Ayse Basar [2]
Turhan, Burak [3]
Affiliations
[1] Bogazici Univ, Dept Comp Engn, TR-34342 Istanbul, Turkey
[2] Ryerson Univ, Ted Rogers Sch Informat Technol Management, Toronto, ON M5B 2K3, Canada
[3] Univ Oulu, Dept Informat Proc Sci, Oulu, Finland
Keywords
Defect prediction; Ensemble of classifiers; Static code attributes; Embedded software; Quality estimation; Design; Faults
DOI
10.1007/s11219-010-9128-1
CLC classification number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
As the application layer in embedded systems dominates over the hardware, ensuring software quality becomes a real challenge. Software testing is the most time-consuming and costly project phase, specifically in the embedded software domain. Misclassifying safe code as defective increases the cost of projects, and hence leads to low margins. In this research, we present a defect prediction model based on an ensemble of classifiers. We have collaborated with an industrial partner from the embedded systems domain. We use our generic defect prediction models with data coming from embedded projects. The embedded systems domain is similar to mission-critical software in that the goal is to catch as many defects as possible. Therefore, a predictor is expected to achieve a very high probability of detection (pd). On the other hand, most embedded systems in practice are commercial products, and companies would like to lower their costs to remain competitive in their market by keeping their false alarm (pf) rates as low as possible and improving their precision rates. In our experiments, we used data collected from our industry partners as well as publicly available data. Our results reveal that an ensemble of classifiers significantly decreases pf down to 15% while increasing precision by 43% and hence, keeping balance rates at 74%. The cost-benefit analysis of the proposed model shows that it is enough to inspect 23% of the code on local datasets to detect around 70% of defects.
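The abstract reports pd, pf, precision, and balance without defining them. The sketch below shows one common way to compute these measures from confusion-matrix counts. The balance formula used here (one minus the normalized Euclidean distance from the ideal point pf = 0, pd = 1) is an assumption taken from the wider defect-prediction literature, not something this record states, and the function name and example counts are hypothetical.

```python
import math


def defect_metrics(tp, fp, tn, fn):
    """Compute pd, pf, precision and balance from confusion-matrix counts.

    tp: defective modules flagged as defective
    fp: defect-free modules flagged as defective (false alarms)
    tn: defect-free modules flagged as defect-free
    fn: defective modules flagged as defect-free (missed defects)
    """
    # Probability of detection (recall): share of defective modules caught.
    pd = tp / (tp + fn) if (tp + fn) else 0.0
    # Probability of false alarm: share of defect-free modules flagged.
    pf = fp / (fp + tn) if (fp + tn) else 0.0
    # Precision: share of flagged modules that are really defective.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # ASSUMPTION: balance as normalized distance from the ideal (pf=0, pd=1).
    balance = 1.0 - math.sqrt((0.0 - pf) ** 2 + (1.0 - pd) ** 2) / math.sqrt(2.0)
    return {"pd": pd, "pf": pf, "precision": precision, "balance": balance}


# Hypothetical counts chosen only for illustration, not data from the paper.
example = defect_metrics(tp=70, fp=15, tn=85, fn=30)
print(example)
```

Under this definition, balance falls when either pd drops or pf rises, which is why a predictor tuned only for high pd can still score poorly on it.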
Pages: 515-536
Number of pages: 22