Within-Project Software Aging Defect Prediction Based on Active Learning

Cited by: 4

Authors:

Liang, Mengting [1]

Li, Dimeng [1]

Xu, Bin [1]

Zhao, Dongdong [1]

Yu, Xiao [1]

Xiang, Jianwen [1]

Affiliation:

[1] Wuhan Univ Technol, Sch Comp & Artificial Intelligence, Wuhan, Peoples R China

Source:

2021 IEEE INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING WORKSHOPS (ISSREW 2021) | 2021

Keywords:

software aging; aging-related bugs prediction; active learning; hashing-based undersampling ensemble;

DOI:

10.1109/ISSREW53611.2021.00037

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification codes:

081202 ; 0835 ;

Abstract:

Long-running software systems tend to exhibit performance degradation and an increasing failure rate, a phenomenon known as software aging. The bugs that cause this phenomenon are called Aging-Related Bugs (ARBs); they may cause serious economic loss or even endanger human safety. ARB prediction has been proposed to help discover and remove ARBs. However, an ARB prediction model often needs a large amount of training data to reach high classification performance, while in practice labeled data are often scarce and it is difficult to label all samples manually. Furthermore, ARB datasets suffer from a serious class imbalance problem. To address these two problems (scarce labels and class imbalance), we propose a framework named QUIRE-HUE. On the one hand, we use an approach named Active Learning by Querying Informative and Representative Examples (QUIRE) to select a few informative and representative samples to label for the training set, which reduces the labeling cost while still yielding a high-performance classification model. On the other hand, we apply a Hashing-Based Undersampling Ensemble (HUE), which constructs diversified training subspaces for undersampling, to alleviate the class imbalance problem. A set of experiments is performed on two large open-source projects (MySQL and Linux) with six different machine learning classifiers. We use Balance and AUC as the evaluation metrics. Experimental results indicate that QUIRE-HUE achieves encouraging results: the average AUC and Balance are 0.769 and 0.812, respectively, on the MySQL dataset, and 0.772 and 0.828, respectively, on the Linux dataset, significantly outperforming all baseline methods.
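The abstract names Balance as an evaluation metric but does not define it. The sketch below assumes the definition commonly used in the defect prediction literature, in which Balance is one minus the normalized Euclidean distance from the point (pf, pd) to the ideal point (pf = 0, pd = 1), where pd is the probability of detection and pf is the probability of false alarm. The function name `balance` and its arguments are illustrative and are not taken from the paper; the paper itself may compute the metric differently.

```python
import math


def balance(tp, fn, fp, tn):
    """Balance from confusion-matrix counts (aging-prone = positive class).

    Assumed definition: 1 - sqrt((0 - pf)^2 + (1 - pd)^2) / sqrt(2).
    """
    # Probability of detection (recall on the positive class).
    pd = tp / (tp + fn) if (tp + fn) else 0.0
    # Probability of false alarm (false positive rate).
    pf = fp / (fp + tn) if (fp + tn) else 0.0
    distance = math.sqrt((0.0 - pf) ** 2 + (1.0 - pd) ** 2)
    return 1.0 - distance / math.sqrt(2.0)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the function.
    print(round(balance(tp=5, fn=5, fp=0, tn=90), 4))
```

Under this definition a classifier that detects every aging-prone module without false alarms scores 1, and one that misses every such module while flagging every clean one scores 0, so the metric stays informative even when the positive class is rare.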
