HYDRA: Massively Compositional Model for Cross-Project Defect Prediction

Cited by: 222

Authors:

Xia, Xin [1]

Lo, David [2]

Pan, Sinno Jialin [3]

Nagappan, Nachiappan [4]

Wang, Xinyu [1]

Affiliations:

[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310000, Zhejiang, Peoples R China

[2] Singapore Management Univ, Sch Informat Syst, Singapore 17890, Singapore

[3] Nanyang Technol Univ, Sch Comp Engn, Singapore, Singapore

[4] Microsoft Res, Testing Verificat & Measurement Res, Redmond, WA 98052 USA

Source:

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING | 2016 / Vol. 42 / No. 10

Keywords:

Cross-project defect prediction; transfer learning; genetic algorithm; ensemble learning;

DOI:

10.1109/TSE.2016.2543218

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Most software defect prediction approaches are trained and applied on data from the same project. However, a new project often does not have enough training data. Cross-project defect prediction, which uses data from other projects to predict defects in a particular project, provides a new perspective on defect prediction. In this work, we propose a HYbrid moDel Reconstruction Approach (HYDRA) for cross-project defect prediction, which includes two phases: a genetic algorithm (GA) phase and an ensemble learning (EL) phase. These two phases create a massive composition of classifiers. To examine the benefits of HYDRA, we perform experiments on 29 datasets from the PROMISE repository, which together contain a total of 11,196 instances (i.e., Java classes) labeled as defective or clean. We experiment with logistic regression as the underlying classification algorithm of HYDRA. We compare our approach with the most recently proposed cross-project defect prediction approaches: TCA+ by Nam et al., Peters filter by Peters et al., GP by Liu et al., MO by Canfora et al., and CODEP by Panichella et al. Our results show that HYDRA achieves an average F1-score of 0.544. On average, across the 29 datasets, these results correspond to an improvement in the F1-scores of 26.22, 34.99, 47.43, 28.61, and 30.14 percent over TCA+, Peters filter, GP, MO, and CODEP, respectively. In addition, HYDRA on average can discover 33 percent of all bugs if developers inspect the top 20 percent of lines of code, which improves on the best baseline approach (TCA+) by 44.41 percent. We also find that HYDRA improves on the F1-score of Zero-R, which predicts all instances to be defective, by 5.42 percent, but improves on Zero-R by 58.65 percent when inspecting the top 20 percent of lines of code. In practice, Zero-R can be hard to use since it simply predicts all of the instances to be defective, and thus developers have to inspect all of the instances to find the defective ones. Moreover, we notice that the improvements of HYDRA over the other baseline approaches, in terms of F1-score and when inspecting the top 20 percent of lines of code, are substantial, and in most cases the improvements are significant and have large effect sizes across the 29 datasets.
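
The two evaluation measures named in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' evaluation code: the function names `f1_score` and `bugs_found_at_loc_budget`, the toy data, and the rule of inspecting whole classes in descending order of predicted score until the line-of-code budget is exhausted are all assumptions made here for illustration. The first function also shows why Zero-R, which labels every instance defective, can still reach a respectable F1-score: its recall is always 1.

```python
from typing import Sequence


def f1_score(actual: Sequence[int], predicted: Sequence[int]) -> float:
    """F1-score for the defective class (label 1 = defective, 0 = clean)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def bugs_found_at_loc_budget(
    scores: Sequence[float],
    loc: Sequence[int],
    bugs: Sequence[int],
    budget: float = 0.20,
) -> float:
    """Fraction of all bugs found under a line-of-code inspection budget.

    Assumption (hypothetical ranking rule): classes are inspected in
    descending order of predicted score, and a class is inspected only
    if it fits entirely within the remaining budget.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    limit = budget * sum(loc)
    total_bugs = sum(bugs)
    inspected_loc = 0
    found = 0
    for i in order:
        if inspected_loc + loc[i] > limit:
            break
        inspected_loc += loc[i]
        found += bugs[i]
    return found / total_bugs if total_bugs else 0.0


# Toy data (invented for illustration): five classes.
actual = [1, 0, 1, 1, 0]
zero_r = [1] * len(actual)  # Zero-R as described: everything is "defective"
print(f1_score(actual, zero_r))

scores = [0.9, 0.2, 0.1, 0.8, 0.4]
loc = [100, 300, 400, 50, 150]
bugs = [2, 0, 1, 1, 0]
print(bugs_found_at_loc_budget(scores, loc, bugs))
```

On the toy data, Zero-R flags all five classes, so its precision equals the share of defective classes while its recall is perfect. The second measure depends on the ranking, which is where an all-defective prediction offers no guidance about what to inspect first.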


Pages: 977-998

Page count: 22

