Classification with reject option for software defect prediction

Cited by: 23
Authors
Mesquita, Diego P. P. [1 ]
Rocha, Lincoln S. [1 ]
Gomes, Joao Paulo P. [1 ]
Rocha Neto, Ajalmar R. [2 ]
Affiliations
[1] Univ Fed Ceara, Dept Comp Sci, Fortaleza, Ceara, Brazil
[2] Fed Inst Ceara, Dept Teleinformat, Fortaleza, Ceara, Brazil
Keywords
Software defect prediction; Classification with reject option; Extreme learning machines; Metrics
DOI
10.1016/j.asoc.2016.06.023
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Context: Software defect prediction (SDP) is an important task in software engineering. Along with estimating the number of defects remaining in software systems and discovering defect associations, classifying the defect-proneness of software modules plays an important role in software defect prediction. Several machine-learning methods have been applied to handle the defect-proneness of software modules as a classification problem. This type of yes-or-no decision is an important drawback in the decision-making process and, if not precise, may lead to misclassifications. To the best of our knowledge, existing approaches rely on fully automated module classification and do not provide a way to incorporate extra knowledge during the classification process. This knowledge can be helpful in avoiding misclassifications in cases where system modules cannot be classified in a reliable way. Objective: We seek to develop an SDP method that (i) incorporates a reject option in the classifier to improve the reliability of the decision-making process; and (ii) makes it possible to postpone the final decision on rejected modules for expert analysis or even for another classifier using extra domain knowledge. Method: We develop an SDP method called rejoELM and its variant, IrejoELM. Both methods are built upon the weighted extreme learning machine (ELM) with reject option, which makes it possible to postpone the final decision on non-classified modules, the rejected ones, to another moment. While rejoELM aims to maximize the accuracy for a given rejection rate, IrejoELM maximizes the F-measure. Hence, IrejoELM becomes an alternative for classification with reject option on imbalanced datasets. Results: rejoELM and IrejoELM are tested on five datasets of source code metrics extracted from real-world open-source software projects. Results indicate that rejoELM has an accuracy for several rejection rates that is comparable to some state-of-the-art classifiers with reject option. Although IrejoELM shows lower accuracies for several rejection rates, it clearly outperforms all other methods when the F-measure is used as a performance metric. Conclusion: It is concluded that rejoELM is a valid alternative for classification with reject option problems when classes are nearly equally represented. On the other hand, IrejoELM is shown to be the best alternative for classification with reject option on imbalanced datasets. Since SDP problems are usually characterized as imbalanced learning problems, the use of IrejoELM is recommended. (C) 2016 Elsevier B.V. All rights reserved.
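The general idea of classification with a reject option can be illustrated with a minimal sketch. This is not the rejoELM or IrejoELM algorithm from the paper: it assumes a generic binary classifier that outputs a real-valued score per module, and it uses a simple symmetric threshold on the absolute score as the rejection rule. The function names, the threshold value, and the toy scores and labels are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple


def classify_with_reject(scores: List[float], threshold: float) -> List[Optional[int]]:
    """Return +1 (defect-prone) or -1 (not defect-prone) for confident scores.

    A module whose absolute score falls below `threshold` is rejected
    (None), meaning the decision is deferred to an expert or to another
    classifier. This symmetric rule is an assumption for illustration.
    """
    decisions: List[Optional[int]] = []
    for s in scores:
        if abs(s) < threshold:
            decisions.append(None)  # rejected: postpone the decision
        else:
            decisions.append(1 if s > 0 else -1)
    return decisions


def accuracy_and_rejection_rate(
    decisions: List[Optional[int]], labels: List[int]
) -> Tuple[float, float]:
    """Accuracy over accepted modules only, plus the fraction rejected."""
    accepted = [(d, y) for d, y in zip(decisions, labels) if d is not None]
    rejection_rate = 1.0 - len(accepted) / len(labels)
    if not accepted:
        return 0.0, rejection_rate
    correct = sum(1 for d, y in accepted if d == y)
    return correct / len(accepted), rejection_rate


# Toy data (hypothetical): classifier scores and true labels for six modules.
scores = [0.9, -0.8, 0.1, -0.05, 0.6, -0.7]
labels = [1, -1, -1, 1, 1, 1]

decisions = classify_with_reject(scores, threshold=0.2)
acc, rej = accuracy_and_rejection_rate(decisions, labels)
print(decisions, acc, rej)
```

Raising the threshold rejects more modules and usually raises accuracy on the ones that remain, which is the accuracy versus rejection-rate trade-off the abstract refers to. On imbalanced data the same loop could report an F-measure over the accepted modules instead of accuracy.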
Pages: 1085-1093
Number of pages: 9