Studying the effectiveness of deep active learning in software defect prediction

Cited by: 2
Authors
Feyzi F. [1]
Daneshdoost A. [1]
Affiliation
[1] Faculty of Engineering, University of Guilan, Rasht
Keywords
active learning; bug prediction; code metrics; deep learning
DOI
10.1080/1206212X.2023.2252117
Abstract
Accurate prediction of defective software modules is important for prioritizing quality assurance efforts, allocating testing resources sensibly, reducing costs, and improving software quality. Several studies have used machine learning to predict software defects. However, complex structures and imbalanced class distributions in software defect data make it challenging to learn an effective defect prediction model. In this article, two deep learning-based defect prediction models that use static code metrics are proposed. To enhance the learning process and improve the performance of the proposed models, pool-based active learning is employed. The study thereby investigates whether active learning can reduce the large amount of labeled data otherwise needed to build deep learning models. To deal with the imbalanced distribution of software modules between the defective and non-defective classes, Near-Miss under-sampling and KNN, with different numbers of neighbors, are used. They were chosen for their good performance in binary classification problems. Experiments are performed on two well-known, publicly available datasets for Java projects: the GitHub Bug Dataset and the Unified Bug Dataset. The evaluation results show that the proposed models are more effective than traditional machine learning algorithms. On the Unified Bug Dataset, F-measure and AUC improved by 13 and 11 percent, respectively, at the file level, and by 14 and 11 percent, respectively, at the class level. © 2023 Informa UK Limited, trading as Taylor & Francis Group.
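The two techniques named in the abstract, Near-Miss under-sampling and pool-based active learning, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a NearMiss-1 style selection rule, uses a small k-nearest-neighbor vote as a stand-in for the paper's deep models, queries by uncertainty sampling (one common pool-based strategy, which the abstract does not specify), and runs on synthetic two-metric data with a hypothetical `oracle` labeling rule. All function names are illustrative.

```python
import random
from math import dist


def near_miss_1(majority, minority, n_keep, k=3):
    """NearMiss-1 style under-sampling: keep the n_keep majority points whose
    mean distance to their k nearest minority points is smallest."""
    def mean_dist_to_nearest_minority(x):
        nearest = sorted(dist(x, m) for m in minority)[:k]
        return sum(nearest) / len(nearest)
    return sorted(majority, key=mean_dist_to_nearest_minority)[:n_keep]


def knn_defect_probability(labeled, x, k=3):
    """Fraction of the k nearest labeled modules that are defective (label 1).
    Stand-in classifier; the paper uses deep learning models instead."""
    nearest = sorted(labeled, key=lambda item: dist(item[0], x))[:k]
    return sum(label for _, label in nearest) / len(nearest)


def query_most_uncertain(labeled, pool, k=3):
    """Index of the pool item whose predicted probability is closest to 0.5."""
    return min(
        range(len(pool)),
        key=lambda i: abs(knn_defect_probability(labeled, pool[i], k) - 0.5),
    )


def active_learning(labeled, pool, oracle, budget, k=3):
    """Pool-based loop: repeatedly pick the most uncertain unlabeled module,
    ask the oracle for its label, and move it into the labeled set."""
    labeled, pool = list(labeled), list(pool)
    for _ in range(min(budget, len(pool))):
        x = pool.pop(query_most_uncertain(labeled, pool, k))
        labeled.append((x, oracle(x)))
    return labeled, pool


# Synthetic data: each module is a pair of normalized code metrics (assumed).
rng = random.Random(0)


def oracle(x):
    """Hypothetical ground truth: defective when both metrics are high."""
    return 1 if x[0] + x[1] > 1.4 else 0


modules = [(rng.random(), rng.random()) for _ in range(200)]
defective = [m for m in modules if oracle(m) == 1]
clean = [m for m in modules if oracle(m) == 0]

# Balance the classes by under-sampling the non-defective majority.
balanced_clean = near_miss_1(clean, defective, n_keep=len(defective))

# Start from a few labeled modules and label 10 more through queries.
seed = [(m, oracle(m)) for m in defective[:3] + balanced_clean[:3]]
pool = defective[3:] + balanced_clean[3:]
labeled, remaining = active_learning(seed, pool, oracle, budget=10)
print(len(seed), "seed labels ->", len(labeled), "labels after querying")
```

In this sketch the labeling budget is the quantity active learning tries to keep small: only the modules the current model is least sure about are sent to the oracle, rather than labeling the whole pool up front.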
Pages: 534-552
Page count: 18