Just-in-time software defect prediction method for non-stationary and imbalanced data streams

Authors
Wu, Qikai [1]
Wang, Xingqi [1,2]
Wei, Dan [1,2]
Chen, Bin [1,2]
Dang, Qingguo [1]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Comp Sci, Hangzhou 310018, Peoples R China
[2] Key Lab Discrete Ind Internet Things Zhejiang Prov, Hangzhou, Peoples R China
Keywords
Just-in-time software defect prediction; Online learning; Concept drift; Verification latency; Class imbalance learning
DOI
10.1007/s11219-025-09711-w
Chinese Library Classification
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Just-In-Time Software Defect Prediction (JIT-SDP) is finer-grained than traditional software defect prediction: it predicts defects at the level of individual software changes. In online data stream scenarios, however, JIT-SDP datasets suffer from validation delay (verification latency), concept drift, and class imbalance evolution, all of which severely degrade predictive performance. This paper introduces JNAI (JIT-SDP method for Non-stationary And Imbalanced data streams), which addresses these three problems. First, it proposes a validation delay framework to correct data labels, together with a concept drift adaptation mechanism that combines intra-project and cross-project data filtering to mitigate concept drift while avoiding the prediction bias caused by cross-project data. Next, it designs a dynamic classifier selection method that integrates a tiered AdaBoost: classifiers trained on preceding data iteratively predict the labels of subsequent data, which addresses class distribution imbalance in the stream. Finally, a Hoeffding Tree is chosen as the base classifier and trained on the processed dataset to form the final model. Experiments on six public JIT-SDP datasets and ten open-source GitHub projects show that JNAI effectively improves the predictive performance of just-in-time software defect prediction.
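The abstract does not describe how the validation delay framework works internally. As a rough illustration of the underlying problem only, the following minimal Python sketch shows a generic waiting-period labeling scheme of the kind commonly used in online JIT-SDP: a change is provisionally treated as clean once a waiting period has passed without a linked defect, and its label is corrected to defective if a defect is found later. The `Commit` class, the `labels_available` function, and the `waiting_days` parameter are hypothetical names chosen for this example; this is not JNAI's actual framework.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Commit:
    """Hypothetical record of one software change in the stream."""
    commit_id: str
    committed_at: int                       # day index when the change was made
    defect_found_at: Optional[int] = None   # day a defect was linked to it, if ever


def labels_available(commits: List[Commit], now: int,
                     waiting_days: int) -> List[Tuple[str, int]]:
    """Return (commit_id, label) pairs usable for training on day `now`.

    Label 1 (defective): a defect has already been linked to the change.
    Label 0 (clean): the waiting period elapsed with no defect found so far.
    Changes that satisfy neither condition are still unlabeled and are skipped.
    """
    labeled = []
    for c in commits:
        if c.defect_found_at is not None and c.defect_found_at <= now:
            labeled.append((c.commit_id, 1))
        elif now - c.committed_at >= waiting_days:
            labeled.append((c.commit_id, 0))
    return labeled


# A change whose defect surfaces only after the waiting period has elapsed.
stream = [Commit("a", committed_at=0, defect_found_at=15)]
early = labels_available(stream, now=10, waiting_days=10)  # provisionally clean
late = labels_available(stream, now=15, waiting_days=10)   # corrected to defective
```

In this sketch the same change is labeled clean on day 10 and defective on day 15, which is the kind of label noise that a label-correction step has to handle.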
Pages: 34