Within-Project Software Aging Defect Prediction Based on Active Learning

Cited by: 4
Authors
Liang, Mengting [1]
Li, Dimeng [1]
Xu, Bin [1]
Zhao, Dongdong [1]
Yu, Xiao [1]
Xiang, Jianwen [1]
Affiliations
[1] Wuhan Univ Technol, Sch Comp & Artificial Intelligence, Wuhan, Peoples R China
Source
2021 IEEE INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING WORKSHOPS (ISSREW 2021) | 2021
Keywords
software aging; aging-related bugs prediction; active learning; hashing-based undersampling ensemble;
DOI
10.1109/ISSREW53611.2021.00037
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Long-running software systems tend to exhibit performance degradation and an increased failure rate, a phenomenon known as software aging. The bugs that cause this phenomenon are called Aging-Related Bugs (ARBs), and they may cause serious economic loss or even endanger human safety. ARB prediction has been proposed to discover and remove ARBs. However, an ARB prediction model often needs a large amount of training data to reach high classification performance. In practice, labeled data are scarce in many cases, and it is difficult to label all samples manually. Furthermore, ARB datasets suffer from a serious class imbalance problem. To address these two problems (scarce labeled data and class imbalance), we propose a framework named QUIRE-HUE. On the one hand, we use an approach named Active Learning by Querying Informative and Representative Examples (QUIRE) to select a few informative and representative samples to label for the training set, which reduces the labeling cost while still yielding a high-performance classification model. On the other hand, we apply a Hashing-Based Undersampling Ensemble (HUE), which constructs diversified training subspaces for undersampling, to alleviate the class imbalance problem. A set of experiments is performed on two large open-source projects (MySQL and Linux) with six different machine learning classifiers. We use Balance and AUC as the evaluation metrics. Experimental results indicate that QUIRE-HUE achieves encouraging results: the average AUC and Balance are 0.769 and 0.812, respectively, on the MySQL dataset and 0.772 and 0.828, respectively, on the Linux dataset, significantly outperforming all baseline methods.
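The abstract names Balance and AUC as its evaluation metrics but does not define them. The sketch below shows how these two metrics are commonly computed in the defect-prediction literature. It assumes the usual definition of Balance as one minus the normalized Euclidean distance from the ideal point (pf = 0, pd = 1), and the rank-based reading of AUC. The function names and the toy inputs are illustrative and are not taken from the paper.

```python
import math


def balance(tp, fn, fp, tn):
    """Balance = 1 - distance from the ideal point (pf=0, pd=1), scaled by sqrt(2).

    Assumed definition (common in defect prediction); the abstract gives no formula.
    """
    pd = tp / (tp + fn)  # probability of detection (recall on the buggy class)
    pf = fp / (fp + tn)  # probability of false alarm
    return 1 - math.sqrt((0 - pf) ** 2 + (1 - pd) ** 2) / math.sqrt(2)


def auc(labels, scores):
    """AUC as the chance that a random positive scores above a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy confusion matrix and scores, purely illustrative.
    print(balance(tp=8, fn=2, fp=10, tn=80))
    print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

Both metrics are insensitive to the decision threshold (AUC) or weigh detection and false alarms jointly (Balance), which is presumably why they suit imbalanced ARB datasets better than plain accuracy.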
Pages: 1 / 8
Page count: 8
Related papers
47 in total
  • [21] HUANG YN, 1995, DIG PAP INT SYMP FAU, P381, DOI 10.1109/FTCS.1995.466961
  • [22] Feature Selection Techniques to Counter Class Imbalance Problem for Aging Related Bug Prediction
    Kumar, Lov
    Sureka, Ashish
    [J]. ISEC'18: PROCEEDINGS OF THE 11TH INNOVATIONS IN SOFTWARE ENGINEERING CONFERENCE, 2018,
  • [23] Sample-based software defect prediction with active and semi-supervised learning
    Li, Ming
    Zhang, Hongyu
    Wu, Rongxin
    Zhou, Zhi-Hua
    [J]. AUTOMATED SOFTWARE ENGINEERING, 2012, 19 (02) : 201 - 230
  • [24] A systematic review of unsupervised learning techniques for software defect prediction
    Li, Ning
    Shepperd, Martin
    Guo, Yuchen
    [J]. INFORMATION AND SOFTWARE TECHNOLOGY, 2020, 122 (122)
  • [25] Effort-Aware semi-Supervised just-in-Time defect prediction
    Li, Weiwei
    Zhang, Wenzhou
    Jia, Xiuyi
    Huang, Zhiqiu
    [J]. INFORMATION AND SOFTWARE TECHNOLOGY, 2020, 126
  • [26] A two-phase transfer learning model for cross-project defect prediction
    Liu, Chao
    Yang, Dan
    Xia, Xin
    Yan, Meng
    Zhang, Xiaohong
    [J]. INFORMATION AND SOFTWARE TECHNOLOGY, 2019, 107 : 125 - 136
  • [27] Lu Huihua, 2012, Proceedings of the 8th International Conference on Predictive Models in Software Engineering, PROMISE '12, P79
  • [28] Active Learning for Software Defect Prediction
    Luo, Guangchun
    Ma, Ying
    Qin, Ke
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2012, E95D (06) : 1680 - 1683
  • [29] Software Defect Prediction Using Ensemble Learning: A Systematic Literature Review
    Matloob, Faseeha
    Ghazal, Taher M.
    Taleb, Nasser
    Aftab, Shabib
    Ahmad, Munir
    Khan, Muhammad Adnan
    Abbas, Sagheer
    Soomro, Tariq Rahim
    [J]. IEEE ACCESS, 2021, 9 : 98754 - 98771
  • [30] Hashing-Based Undersampling Ensemble for Imbalanced Pattern Classification Problems
    Ng, Wing W. Y.
    Xu, Shichao
    Zhang, Jianjun
    Tian, Xing
    Rong, Tongwen
    Kwong, Sam
    [J]. IEEE TRANSACTIONS ON CYBERNETICS, 2022, 52 (02) : 1269 - 1279