ARRAY: Adaptive triple feature-weighted transfer Naive Bayes for cross-project defect prediction

Cited by: 6
Authors
Tong, Haonan [1]
Lu, Wei [1]
Xing, Weiwei [1]
Wang, Shihai [2]
Affiliations
[1] Beijing Jiaotong Univ, Sch Software Engn, Beijing 100044, Peoples R China
[2] Beihang Univ, Sch Reliabil & Syst Engn, Sci & Technol Reliabil & Environm Engn Lab, Beijing 100191, Peoples R China
Keywords
Cross-project defect prediction; Common metrics; Transfer learning; Feature weighting; Model adaptation; FEATURE-SELECTION; SOFTWARE DEFECTS; MODEL; QUALITY; SUITE;
DOI
10.1016/j.jss.2023.111721
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
Context: Cross-project defect prediction (CPDP) aims to predict defects in target data by using prediction models trained on a source dataset. However, because the source and target distributions differ greatly, building high-performance CPDP models remains a challenge.
Objective: We propose a novel high-performance CPDP method named adaptive triple feature-weighted transfer naive Bayes (ARRAY).
Methods: ARRAY is characterized by feature-weighted similarity, feature-weighted instance weight, and model adaptive adjustment. Experiments are performed on 34 defect datasets. We compare ARRAY with seven state-of-the-art CPDP methods in terms of area under the ROC curve (AUC), F1, and Matthews correlation coefficient (MCC), using statistical testing methods.
Results: Experimental results show that: (1) on average, ARRAY improves MCC, AUC, and F1 over the baselines by at least 18.4%, 6.5%, and 4.5%, respectively; (2) ARRAY performs significantly better than each baseline on most datasets; (3) ARRAY significantly outperforms all baselines with a non-negligible effect size according to the post-hoc test.
Conclusion: It can be concluded that: (1) the proposed feature-weighted similarity, feature-weighted instance weight, and model adaptive adjustment are very helpful for improving the performance of CPDP models; (2) ARRAY is a more promising alternative for CPDP with common metrics.
© 2023 Elsevier Inc. All rights reserved.
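For readers unfamiliar with the three evaluation measures named in the abstract, the following is a minimal sketch of their standard textbook definitions. It is not taken from the paper: the function names are hypothetical, and returning 0.0 for undefined cases is an assumption made here for convenience.

```python
import math


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN). Returns 0.0 when undefined (assumption)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal is empty (assumption).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (defective, clean) pairs in which the defective
    instance receives the higher score; ties count as half a win.
    Labels are 1 for defective and 0 for clean.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one instance of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The pairwise AUC is quadratic in the number of instances, which is adequate for illustration; a rank-based formulation would be preferable for large datasets.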
Pages: 16