Cross-Project Online Just-In-Time Software Defect Prediction

Cited by: 16
Authors
Tabassum, Sadia [1]
Minku, Leandro L. [1]
Feng, Danyi [2]
Affiliations
[1] Univ Birmingham, Sch Comp Sci, Birmingham B15 2TT, England
[2] Xiliu Tech, Beijing 100050, Peoples R China
Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
Training; Software; Training data; Predictive models; Codes; Resource management; Open source software; Software defect prediction; cross-project learning; transfer learning; online learning; verification latency; concept drift; MACHINE;
DOI
10.1109/TSE.2022.3150153
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Subject classification codes
081202; 0835
Abstract
Cross-Project (CP) Just-In-Time Software Defect Prediction (JIT-SDP) makes use of CP data to overcome the lack of data necessary to train well-performing JIT-SDP classifiers at the beginning of software projects. However, such approaches have never been investigated in realistic online learning scenarios, where Within-Project (WP) software changes naturally arrive over time and can be used to automatically update the classifiers. We provide the first investigation of when and to what extent CP data are useful for JIT-SDP in such realistic scenarios. For that, we propose three different online CP JIT-SDP approaches that can be updated with incoming CP and WP training examples over time. We also collect data on 9 proprietary software projects and use 10 open source software projects to analyse these approaches. We find that training classifiers with incoming CP+WP data can lead to absolute improvements in G-mean of up to 53.89% and up to 35.02% at the initial stage of the projects compared to classifiers using WP-only and CP-only data, respectively. Using CP+WP data was also shown to be beneficial after a large amount of WP data had been received. Using CP data to supplement WP data helped the classifiers to reduce or prevent large drops in predictive performance that may occur over time, leading to absolute G-mean improvements of up to 37.35% and 48.16% compared to WP-only and CP-only data during such periods, respectively. During periods of stable predictive performance, absolute improvements were up to 29.03% and up to 41.25% compared to WP-only and CP-only classifiers, respectively. Our results highlight the importance of using both CP and WP data together in realistic online JIT-SDP scenarios.
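The abstract reports its results as absolute improvements in G-mean. The record does not spell out the formula, so the following is only a minimal sketch, assuming the common binary-classification definition of G-mean as the geometric mean of the recall on the defect-inducing class (label 1) and the recall on the clean class (label 0). The function names and the label encoding are hypothetical and are not taken from the paper.

```python
import math


def recall(y_true, y_pred, label):
    """Fraction of examples whose true class is `label` that were predicted as `label`."""
    predictions_for_label = [p for t, p in zip(y_true, y_pred) if t == label]
    if not predictions_for_label:
        raise ValueError(f"no examples of class {label!r} in y_true")
    hits = sum(1 for p in predictions_for_label if p == label)
    return hits / len(predictions_for_label)


def g_mean(y_true, y_pred):
    """Geometric mean of the recalls on class 1 (assumed: defect-inducing) and class 0 (assumed: clean)."""
    return math.sqrt(recall(y_true, y_pred, 1) * recall(y_true, y_pred, 0))


def absolute_improvement(g_new, g_base):
    """Difference between two G-mean values in [0, 1], expressed in percentage points."""
    return 100.0 * (g_new - g_base)
```

Under this reading, an "absolute improvement of up to 53.89%" means a difference of 53.89 percentage points between two G-mean values, not a relative gain. Because G-mean multiplies the two recalls, a classifier that ignores either class entirely scores 0, which is why the metric is often preferred for class-imbalanced defect data.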
Pages: 268-287
Number of pages: 20
Related papers
50 in total
  • [1] An Investigation of Cross-Project Learning in Online Just-In-Time Software Defect Prediction
    Tabassum, Sadia
    Minku, Leandro L.
    Feng, Danyi
    Cabral, George G.
    Song, Liyan
    2020 ACM/IEEE 42ND INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2020), 2020, : 554 - 565
  • [2] Online cross-project approach with project-level similarity for just-in-time software defect prediction
    Teng, Cong
    Song, Liyan
    Yao, Xin
    EMPIRICAL SOFTWARE ENGINEERING, 2024, 29 (06)
  • [3] Mobile Application Online Cross-Project Just-in-Time Software Defect Prediction Framework
    Jiang, Siyu
    He, Zhenhang
    Chen, Yuwen
    Zhang, Mingrong
    Ma, Le
    ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 2024, 33 (06)
  • [4] Towards Reliable Online Just-in-Time Software Defect Prediction
    Cabral, George G.
    Minku, Leandro L.
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2023, 49 (03) : 1342 - 1358
  • [5] The Impact of Data Merging on the Interpretation of Cross-Project Just-In-Time Defect Models
    Lin, Dayi
    Tantithamthavorn, Chakkrit
    Hassan, Ahmed E.
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2022, 48 (08) : 2969 - 2986
  • [6] A Survey on Cross-Project Software Defect Prediction Methods
    Chen X.
    Wang L.-P.
    Gu Q.
    Wang Z.
    Ni C.
    Liu W.-S.
    Wang Q.-P.
    2018, Science Press (41) : 254 - 274
  • [7] FENSE: A feature-based ensemble modeling approach to cross-project just-in-time defect prediction
    Zhang, Tanghaoran
    Yu, Yue
    Mao, Xinjun
    Lu, Yao
    Li, Zhixing
    Wang, Huaimin
    EMPIRICAL SOFTWARE ENGINEERING, 2022, 27 (07)
  • [8] Within-project and cross-project just-in-time defect prediction based on denoising autoencoder and convolutional neural network
    Zhu, Kun
    Zhang, Nana
    Ying, Shi
    Zhu, Dandan
    IET SOFTWARE, 2020, 14 (03) : 185 - 202
  • [9] A Practical Human Labeling Method for Online Just-in-Time Software Defect Prediction
    Song, Liyan
    Minku, Leandro Lei
    Teng, Cong
    Yao, Xin
    PROCEEDINGS OF THE 31ST ACM JOINT MEETING EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING, ESEC/FSE 2023, 2023, : 605 - 617
  • [10] Cross-Project Software Defect Prediction Based on Class Code Similarity
    Wen, Wanzhi
    Shen, Chenqiang
    Lu, Xiaohong
    Li, Zhixian
    Wang, Haoren
    Zhang, Ruinian
    Zhu, Ningbo
    IEEE ACCESS, 2022, 10 : 105485 - 105495