Nature-Based Prediction Model of Bug Reports Based on Ensemble Machine Learning Model

Cited by: 3

Authors:

Alsaedi, Shatha Abed ^{[1,2]}

Noaman, Amin Yousef ^{[1]}

Gad-Elrab, Ahmed A. A. ^{[1]}

Eassa, Fathy Elbouraey ^{[1]}

Affiliations:

[1] King Abdulaziz Univ, Fac Comp & Informat Technol, Dept Comp Sci, Jeddah 21589, Saudi Arabia

[2] Taibah Univ, Coll Comp Sci & Engn, Dept Comp Sci, Yanbu 46421, Saudi Arabia

Source:

IEEE ACCESS | 2023 / Volume 11

Keywords:

Computer bugs; Software maintenance; Predictive models; Machine learning; Ensemble learning; Maintenance engineering; Support vector machines; Natural language processing; nature classification; ensemble machine learning algorithm; bug reports

DOI:

10.1109/ACCESS.2023.3288156

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

In software development, the maintenance of software systems has attracted researchers' attention because of its importance in fixing the defects discovered during software testing. Maintenance relies on bug reports (BRs), which include detailed information such as the description, status, reporter, assignee, priority, and severity of a bug. The main problem in this process is how to analyze these BRs to discover all defects in the system, a tedious and time-consuming task when done manually because the number of BRs increases dramatically. An automated solution is therefore preferable. Most current research automates individual aspects of this process, such as detecting the severity or priority of a bug, but does not consider the nature of the bug, which is a multi-class classification problem. This paper addresses that problem by proposing a new prediction model that analyzes BRs and predicts the nature of the bug. The proposed model constructs an ensemble machine learning algorithm using natural language processing (NLP) and machine learning techniques. We simulate the proposed model on a publicly available dataset from two online software bug repositories (Mozilla and Eclipse), which includes six classes: Program Anomaly, GUI, Network or Security, Configuration, Performance, and Test-Code. The simulation results show that the proposed model achieves better accuracy than most existing models: 90.42% without text augmentation and 96.72% with text augmentation.
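
The abstract does not say which base learners, text features, or voting scheme the ensemble uses, so the following is only a minimal sketch of the general idea: tokenize bug-report text, train several simple classifiers, and combine their predictions by hard majority vote. The three base learners, the `VotingEnsemble` class, and the tiny training set are all assumptions made up for illustration. Only the six class names come from the abstract. The sketch uses the Python standard library alone so that it runs without extra dependencies.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and keep alphabetic tokens (stand-in for fuller NLP preprocessing)."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(self.class_counts[label] / total)
            for tok in tokenize(doc):
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best, best_score = label, score
        return best


class NearestCentroid:
    """Pick the class whose summed term-count vector is most cosine-similar."""

    def fit(self, docs, labels):
        self.centroids = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.centroids[label].update(tokenize(doc))
        return self

    def predict(self, doc):
        vec = Counter(tokenize(doc))

        def cosine(a, b):
            dot = sum(a[t] * b[t] for t in a)
            norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
                sum(v * v for v in b.values()))
            return dot / norm if norm else 0.0

        return max(sorted(self.centroids),
                   key=lambda c: cosine(vec, self.centroids[c]))


class NearestNeighbor:
    """1-nearest-neighbour using Jaccard similarity over token sets."""

    def fit(self, docs, labels):
        self.items = [(set(tokenize(d)), l) for d, l in zip(docs, labels)]
        return self

    def predict(self, doc):
        query = set(tokenize(doc))

        def jaccard(tokens):
            union = query | tokens
            return len(query & tokens) / len(union) if union else 0.0

        return max(self.items, key=lambda item: jaccard(item[0]))[1]


class VotingEnsemble:
    """Hard majority vote; on a tie the earliest-listed model's vote wins."""

    def __init__(self, models):
        self.models = models

    def fit(self, docs, labels):
        for m in self.models:
            m.fit(docs, labels)
        return self

    def predict(self, doc):
        votes = Counter(m.predict(doc) for m in self.models)
        return votes.most_common(1)[0][0]


# Invented toy reports (NOT from the Mozilla/Eclipse dataset), one per class.
TRAIN = [
    ("application crashes with null pointer exception on startup", "Program Anomaly"),
    ("button label overlaps the toolbar icon in the dialog window", "GUI"),
    ("ssl certificate validation fails for https connection", "Network or Security"),
    ("build fails because the config file path setting is wrong", "Configuration"),
    ("page rendering is slow and memory usage keeps growing", "Performance"),
    ("unit test assertion is flaky in the test suite", "Test-Code"),
]
docs, labels = zip(*TRAIN)
model = VotingEnsemble([NaiveBayes(), NearestCentroid(), NearestNeighbor()])
model.fit(docs, labels)
print(model.predict("dialog window button is misaligned"))
```

Hard voting is used here only because it is the simplest way to combine heterogeneous classifiers. A weighted or probability-averaging (soft) vote is an equally plausible reading of "ensemble", and the accuracy figures reported in the abstract cannot be reproduced with a toy example of this size.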

Pages: 63916-63931

Page count: 16

