ARRAY: Adaptive triple feature-weighted transfer Naive Bayes for cross-project defect prediction

Cited by: 6

Authors:

Tong, Haonan [1]

Lu, Wei [1]

Xing, Weiwei [1]

Wang, Shihai [2]

Affiliations:

[1] Beijing Jiaotong Univ, Sch Software Engn, Beijing 100044, Peoples R China

[2] Beihang Univ, Sch Reliabil & Syst Engn, Sci & Technol Reliabil & Environm Engn Lab, Beijing 100191, Peoples R China

Source:

JOURNAL OF SYSTEMS AND SOFTWARE | 2023 / Vol. 202

Keywords:

Cross-project defect prediction; Common metrics; Transfer learning; Feature weighting; Model adaptation; FEATURE-SELECTION; SOFTWARE DEFECTS; MODEL; QUALITY; SUITE;

DOI:

10.1016/j.jss.2023.111721

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Context: Cross-project defect prediction (CPDP) aims to predict defects in target data using prediction models trained on a source dataset. However, owing to the large distribution difference between source and target data, building high-performance CPDP models remains a challenge. Objective: We propose a novel high-performance CPDP method named adaptive triple feature-weighted transfer naive Bayes (ARRAY). Methods: ARRAY is characterized by feature-weighted similarity, feature-weighted instance weight, and model adaptive adjustment. Experiments are performed on 34 defect datasets. We compare ARRAY with seven state-of-the-art CPDP methods in terms of area under the ROC curve (AUC), F1, and Matthews correlation coefficient (MCC), using statistical testing methods. Results: Experimental results show that: (1) on average, ARRAY improves MCC, AUC, and F1 over the baselines by at least 18.4%, 6.5%, and 4.5%, respectively; (2) ARRAY performs significantly better than each baseline on most datasets; (3) ARRAY significantly outperforms all baselines with non-negligible effect size according to the post-hoc test. Conclusion: It can be concluded that: (1) the proposed feature-weighted similarity, feature-weighted instance weight, and model adaptive adjustment are very helpful for improving the performance of CPDP models; (2) ARRAY is a promising alternative for CPDP with common metrics. © 2023 Elsevier Inc. All rights reserved.
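The three evaluation measures named in the abstract (AUC, F1, and MCC) have standard textbook definitions. The sketch below illustrates them in plain Python. The function names (`f1_score`, `mcc`, `auc`) and the toy inputs are assumptions made for illustration only; they are not taken from the paper, and the numbers are not results from its experiments.

```python
import math


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(scores, labels):
    """Rank-based AUC: the fraction of (defective, clean) pairs in which
    the defective module receives the higher score; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one module of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy confusion matrix (hypothetical counts, not from the paper).
toy_f1 = f1_score(tp=30, fp=10, fn=20)
toy_mcc = mcc(tp=30, tn=40, fp=10, fn=20)

# Toy predicted defect scores with true labels (1 = defective).
toy_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

MCC uses all four cells of the confusion matrix, so it stays informative on imbalanced defect data, where F1 ignores true negatives. AUC is threshold-independent, which is why the three measures are often reported together.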


Pages: 16

