Unsupervised Deep Domain Adaptation for Heterogeneous Defect Prediction

Cited by: 10
Authors
Gong, Lina [1, 2]
Jiang, Shujuan [1, 3]
Yu, Qiao [4]
Jiang, Li [1, 3]
Affiliations
[1] China Univ Min & Technol, Sch Comp Sci & Technol, Xuzhou, Jiangsu, Peoples R China
[2] Zaozhuang Univ, Dept Informat Sci & Engn, Zaozhuang, Peoples R China
[3] Minist Educ, Engn Res Ctr Mine Digitalizat, Xuzhou, Jiangsu, Peoples R China
[4] Jiangsu Normal Univ, Sch Comp Sci & Technol, Xuzhou, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
heterogeneous defect prediction; neural networks; maximum mean discrepancy; class-imbalance
DOI
10.1587/transinf.2018EDP7289
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Heterogeneous defect prediction (HDP) aims to detect as many defective software modules as possible in one project by using historical data collected from other projects with different metrics. However, these data cannot be used directly because the metric sets differ among projects. Meanwhile, software data contain more non-defective instances than defective ones, which can bias a prediction model toward the non-defective majority class. To address these two problems, we propose an unsupervised deep domain adaptation approach for building an HDP model. Specifically, we first map the data of the source and target projects into a unified metric representation (UMR). Then, we design a simple neural network (SNN) model to deal with the heterogeneity and class-imbalance problems in software defect prediction (SDP). In particular, our model uses the Maximum Mean Discrepancy (MMD) as the distance between the source and target data to reduce the distribution mismatch, and uses the cross-entropy loss function as the classification loss. Extensive experiments on 18 public projects from four datasets indicate that the proposed approach can build an effective HDP model and outperforms the related competing approaches.
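
The combination of an MMD term and a cross-entropy term mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the RBF kernel, the bandwidth `gamma`, the trade-off weight `lam`, and all function names are hypothetical, since the abstract does not specify them.

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """Pairwise RBF kernel between the rows of a and the rows of b (assumed kernel choice)."""
    sq_dist = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clamp tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def mmd2(source, target, gamma=1.0):
    """Biased estimate of the squared MMD between two samples of feature vectors."""
    k_ss = rbf_kernel(source, source, gamma).mean()
    k_tt = rbf_kernel(target, target, gamma).mean()
    k_st = rbf_kernel(source, target, gamma).mean()
    return float(k_ss + k_tt - 2.0 * k_st)


def cross_entropy(probs, labels, eps=1e-12):
    """Binary cross-entropy for predicted defect probabilities and 0/1 labels."""
    probs = np.clip(probs, eps, 1.0 - eps)
    return float(-np.mean(labels * np.log(probs) + (1 - labels) * np.log(1 - probs)))


def total_loss(probs, labels, src_feat, tgt_feat, lam=1.0, gamma=1.0):
    """Classification loss on labeled source data plus a weighted MMD penalty.

    lam is a hypothetical trade-off weight; the abstract does not state one.
    """
    return cross_entropy(probs, labels) + lam * mmd2(src_feat, tgt_feat, gamma)
```

In this sketch the MMD term needs no target labels, which is what makes the setup unsupervised with respect to the target project: only the source predictions enter the cross-entropy term.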
Pages: 537-549
Page count: 13