Nature-Based Prediction Model of Bug Reports Based on Ensemble Machine Learning Model

Cited by: 3
Authors
Alsaedi, Shatha Abed [1, 2]
Noaman, Amin Yousef [1]
Gad-Elrab, Ahmed A. A. [1]
Eassa, Fathy Elbouraey [1]
Affiliations
[1] King Abdulaziz Univ, Fac Comp & Informat Technol, Dept Comp Sci, Jeddah 21589, Saudi Arabia
[2] Taibah Univ, Coll Comp Sci & Engn, Dept Comp Sci, Yanbu 46421, Saudi Arabia
Keywords
Computer bugs; Software maintenance; Predictive models; Machine learning; Ensemble learning; Maintenance engineering; Support vector machines; Natural language processing; Nature classification; Ensemble machine learning algorithm; Bug reports
DOI
10.1109/ACCESS.2023.3288156
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Software maintenance has drawn considerable research attention because it is where defects discovered during software testing get fixed, typically on the basis of bug reports (BRs). A BR records detailed information about a bug, including its description, status, reporter, assignee, priority, and severity. The main challenge is analyzing these BRs to uncover all defects in the system. Done manually, this is tedious and time-consuming because the number of BRs grows rapidly, so an automated solution is preferable. Most current research automates particular aspects of this process, such as detecting the severity or priority of a bug, but does not consider the nature of the bug, which is a multi-class classification problem. This paper addresses that gap by proposing a new prediction model that analyzes BRs and predicts the nature of the bug. The proposed model constructs an ensemble machine learning algorithm using natural language processing (NLP) and machine learning techniques. We simulate the proposed model on a publicly available dataset drawn from two online software bug repositories (Mozilla and Eclipse), which covers six classes: Program Anomaly, GUI, Network or Security, Configuration, Performance, and Test-Code. The simulation results show that the proposed model achieves better accuracy than most existing models: 90.42% without text augmentation and 96.72% with text augmentation.
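The abstract does not say which base learners the ensemble uses or how their outputs are combined. As a rough illustration of the general idea only, the sketch below assumes hard (majority) voting over three toy keyword-based classifiers. The keyword lists, the default class, and the function names are all hypothetical and are not taken from the paper.

```python
import re
from collections import Counter

# The six bug-nature classes named in the abstract.
CLASSES = ["Program Anomaly", "GUI", "Network or Security",
           "Configuration", "Performance", "Test-Code"]

# Assumption: fall back to this class when no keyword matches.
DEFAULT_CLASS = "Program Anomaly"


def tokenize(text):
    """Lowercase a bug report and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def make_keyword_classifier(keywords):
    """Build a toy base classifier from a hypothetical {class: keywords} map."""
    for label in keywords:
        if label not in CLASSES:
            raise ValueError(f"unknown class: {label}")

    def classify(text):
        tokens = set(tokenize(text))
        # Score each class by how many of its keywords occur in the report.
        scores = {label: len(tokens & words) for label, words in keywords.items()}
        best = max(scores, key=scores.get)
        return best if scores[best] > 0 else DEFAULT_CLASS

    return classify


def hard_vote(classifiers, text):
    """Return the label chosen by the most base classifiers.

    Ties go to the label predicted by the earliest classifier in the list.
    """
    predictions = [clf(text) for clf in classifiers]
    votes = Counter(predictions)
    return max(predictions, key=lambda label: votes[label])


# Three toy base classifiers with made-up keyword lists.
BASE_CLASSIFIERS = [
    make_keyword_classifier({
        "GUI": {"button", "window", "dialog"},
        "Performance": {"slow", "memory", "lag"},
    }),
    make_keyword_classifier({
        "GUI": {"menu", "button", "icon"},
        "Network or Security": {"ssl", "proxy", "certificate"},
    }),
    make_keyword_classifier({
        "Performance": {"slow", "cpu", "freeze"},
        "Configuration": {"build", "install", "settings"},
    }),
]

if __name__ == "__main__":
    report = "Save button in the preferences dialog is slow to respond"
    print(hard_vote(BASE_CLASSIFIERS, report))
```

In a realistic pipeline the keyword classifiers would be replaced by trained models over NLP features such as TF-IDF vectors. The voting step is the only part this sketch is meant to show.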
Pages: 63916-63931
Page count: 16