Revisiting the Impact of Classification Techniques on the Performance of Defect Prediction Models

Cited by: 303

Authors:

Ghotra, Baljinder [1]

McIntosh, Shane [1]

Hassan, Ahmed E. [1]

Affiliations:

[1] Queens Univ, Sch Comp, SAIL, Kingston, ON K7L 3N6, Canada

Source:

2015 IEEE/ACM 37TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, VOL 1 | 2015

Keywords:

SOFTWARE; METRICS; VALIDATION; COMPLEXITY; FAULTS;

DOI:

10.1109/ICSE.2015.91

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Defect prediction models help software quality assurance teams to effectively allocate their limited resources to the most defect-prone software modules. A variety of classification techniques have been used to build defect prediction models, ranging from simple (e.g., logistic regression) to advanced techniques (e.g., Multivariate Adaptive Regression Splines (MARS)). Surprisingly, recent research on the NASA dataset suggests that the performance of a defect prediction model is not significantly impacted by the classification technique that is used to train it. However, the dataset that is used in the prior study is both: (a) noisy, i.e., contains erroneous entries, and (b) biased, i.e., only contains software developed in one setting. Hence, we set out to replicate this prior study in two experimental settings. First, we apply the replicated procedure to the same (known-to-be noisy) NASA dataset, where we derive similar results to the prior study, i.e., the impact that classification techniques have appears to be minimal. Next, we apply the replicated procedure to two new datasets: (a) the cleaned version of the NASA dataset and (b) the PROMISE dataset, which contains open source software developed in a variety of settings (e.g., Apache, GNU). The results on these new datasets show a clear, statistically distinct separation of groups of techniques, i.e., the choice of classification technique has an impact on the performance of defect prediction models. Indeed, contrary to earlier research, our results suggest that some classification techniques tend to produce defect prediction models that outperform others.


Pages: 789-800

Number of pages: 12
