Cross-Project Online Just-In-Time Software Defect Prediction

Cited by: 16
Authors
Tabassum, Sadia [1]
Minku, Leandro L. [1]
Feng, Danyi [2]
Affiliations
[1] Univ Birmingham, Sch Comp Sci, Birmingham B15 2TT, England
[2] Xiliu Tech, Beijing 100050, Peoples R China
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Training; Software; Training data; Predictive models; Codes; Resource management; Open source software; Software defect prediction; cross-project learning; transfer learning; online learning; verification latency; concept drift; MACHINE;
DOI
10.1109/TSE.2022.3150153
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Cross-Project (CP) Just-In-Time Software Defect Prediction (JIT-SDP) makes use of CP data to overcome the lack of data necessary to train well-performing JIT-SDP classifiers at the beginning of software projects. However, such approaches have never been investigated in realistic online learning scenarios, where Within-Project (WP) software changes naturally arrive over time and can be used to automatically update the classifiers. We provide the first investigation of when and to what extent CP data are useful for JIT-SDP in such realistic scenarios. To this end, we propose three different online CP JIT-SDP approaches that can be updated with incoming CP and WP training examples over time. We also collect data on 9 proprietary software projects and use 10 open source software projects to analyse these approaches. We find that training classifiers with incoming CP+WP data can lead to absolute improvements in G-mean of up to 53.89% and up to 35.02% at the initial stage of the projects compared to classifiers using WP-only and CP-only data, respectively. Using CP+WP data was also shown to be beneficial after a large amount of WP data had been received. Using CP data to supplement WP data helped the classifiers to reduce or prevent large drops in predictive performance that may occur over time, leading to absolute G-mean improvements of up to 37.35% and 48.16% compared to WP-only and CP-only data during such periods, respectively. During periods of stable predictive performance, absolute improvements were up to 29.03% and up to 41.25% compared to WP-only and CP-only classifiers, respectively. Our results highlight the importance of using both CP and WP data together in realistic online JIT-SDP scenarios.
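The abstract reports results as absolute improvements in G-mean but does not define the metric. The sketch below assumes the usual definition for binary classification, the geometric mean of the recall on the defect-inducing class and the recall on the clean class, and shows how an absolute improvement in percentage points would be computed. The labels and predictions are invented toy values, not data from the paper, and the function name `g_mean` is hypothetical.

```python
import math


def g_mean(y_true, y_pred):
    """Geometric mean of the recalls on class 1 (defect-inducing) and class 0 (clean).

    Assumed definition: sqrt(recall_1 * recall_0). The abstract does not spell it out.
    """
    pairs = list(zip(y_true, y_pred))
    pos = sum(1 for t, _ in pairs if t == 1)
    neg = sum(1 for t, _ in pairs if t == 0)
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present to compute G-mean")
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return math.sqrt((tp / pos) * (tn / neg))


# Toy example (invented): 4 defect-inducing and 4 clean software changes.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
wp_only_pred = [1, 0, 0, 0, 0, 0, 0, 0]  # misses most defect-inducing changes
cp_wp_pred = [1, 1, 1, 0, 0, 0, 0, 1]    # more balanced across both classes

g_wp = g_mean(y_true, wp_only_pred)
g_cp_wp = g_mean(y_true, cp_wp_pred)

# "Absolute improvement" is the plain difference, expressed in percentage points.
improvement_points = (g_cp_wp - g_wp) * 100
print(f"WP-only G-mean: {g_wp:.2f}")
print(f"CP+WP G-mean:   {g_cp_wp:.2f}")
print(f"Absolute improvement: {improvement_points:.2f} percentage points")
```

Because G-mean multiplies the two recalls, a classifier that ignores one class scores near zero however well it does on the other, which is why the metric suits class-imbalanced defect data.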
Pages: 268-287
Number of pages: 20
Related papers
50 in total
  • [21] Using active learning selection approach for cross-project software defect prediction
    Mi, Wenbo
    Li, Yong
    Wen, Ming
    Chen, Youren
    CONNECTION SCIENCE, 2022, 34 (01) : 1482 - 1499
  • [22] On the validity of retrospective predictive performance evaluation procedures in just-in-time software defect prediction
    Song, Liyan
    Minku, Leandro L.
    Yao, Xin
    EMPIRICAL SOFTWARE ENGINEERING, 2023, 28 (05)
  • [23] Improving Cross-Project Software Defect Prediction Method Through Transformation and Feature Selection Approach
    Bala, Yahaya Zakariyau
    Samat, Pathiah Abdul
    Sharif, Khaironi Yatim
    Manshor, Noridayu
    IEEE ACCESS, 2023, 11 : 2318 - 2326
  • [24] Heterogeneous Cross-Project Defect Prediction via Optimal Transport
    Zong, Xing
    Li, Guiyu
    Zheng, Shang
    Zou, Haitao
    Yu, Hualong
    Gao, Shang
    IEEE ACCESS, 2023, 11 : 12015 - 12030
  • [25] Impact of hyper parameter optimization for cross-project software defect prediction
    Qu Y.
    Chen X.
    Zhao Y.
    Ju X.
    International Journal of Performability Engineering, 2018, 14 (06) : 1291 - 1299
  • [26] Assessing the Effect of Imbalanced Learning on Cross-project Software Defect Prediction
    Sohan, Md Fahimuzzman
    Jabiullah, Md Ismail
    Rahman, Sheikh Shah Mohammad Motiur
    Mahmud, S. M. Hasan
    2019 10TH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION AND NETWORKING TECHNOLOGIES (ICCCNT), 2019,
  • [27] Just-in-time software defect prediction method for non-stationary and imbalanced data streams
    Wu, Qikai
    Wang, Xingqi
    Wei, Dan
    Chen, Bin
    Dang, Qingguo
    SOFTWARE QUALITY JOURNAL, 2025, 33 (01)
  • [28] A Procedure to Continuously Evaluate Predictive Performance of Just-In-Time Software Defect Prediction Models During Software Development
    Song, Liyan
    Minku, Leandro L.
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2023, 49 (02) : 646 - 666
  • [29] A Survey on Transfer Learning for Cross-Project Defect Prediction
    Sotto-Mayor, Bruno
    Kalech, Meir
    IEEE ACCESS, 2024, 12 : 93398 - 93425
  • [30] Cross-Project and Within-Project Semisupervised Software Defect Prediction: A Unified Approach
    Wu, Fei
    Jing, Xiao-Yuan
    Sun, Ying
    Sun, Jing
    Huang, Lin
    Cui, Fangyi
    Sun, Yanfei
    IEEE TRANSACTIONS ON RELIABILITY, 2018, 67 (02) : 581 - 597