Classification with reject option for software defect prediction

Cited by: 23
Authors
Mesquita, Diego P. P. [1 ]
Rocha, Lincoln S. [1 ]
Gomes, Joao Paulo P. [1 ]
Rocha Neto, Ajalmar R. [2 ]
Affiliations
[1] Univ Fed Ceara, Dept Comp Sci, Fortaleza, Ceara, Brazil
[2] Fed Inst Ceara, Dept Teleinformat, Fortaleza, Ceara, Brazil
Keywords
Software defect prediction; Classification with reject option; Extreme learning machines; EXTREME LEARNING-MACHINE; METRICS;
DOI
10.1016/j.asoc.2016.06.023
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Context: Software defect prediction (SDP) is an important task in software engineering. Along with estimating the number of defects remaining in software systems and discovering defect associations, classifying the defect-proneness of software modules plays an important role in software defect prediction. Several machine-learning methods have been applied to handle the defect-proneness of software modules as a classification problem. Such yes-or-no decisions are an important drawback in the decision-making process and, if imprecise, may lead to misclassifications. To the best of our knowledge, existing approaches rely on fully automated module classification and do not provide a way to incorporate extra knowledge during the classification process. This knowledge can help avoid misclassifications in cases where system modules cannot be classified reliably.
Objective: We seek to develop an SDP method that (i) incorporates a reject option in the classifier to improve the reliability of the decision-making process; and (ii) makes it possible to postpone the final decision on rejected modules, leaving it to expert analysis or to another classifier that uses extra domain knowledge.
Method: We develop an SDP method called rejoELM and its variant, IrejoELM. Both methods are built upon the weighted extreme learning machine (ELM) with a reject option, which makes it possible to postpone the final decision on non-classified (rejected) modules to a later moment. While rejoELM aims to maximize the accuracy for a given rejection rate, IrejoELM maximizes the F-measure. Hence, IrejoELM becomes an alternative for classification with reject option on imbalanced datasets.
Results: rejoELM and IrejoELM are tested on five datasets of source code metrics extracted from real-world open-source software projects. Results indicate that, for several rejection rates, rejoELM achieves an accuracy comparable to that of some state-of-the-art classifiers with reject option. Although IrejoELM shows lower accuracies for several rejection rates, it clearly outperforms all other methods when the F-measure is used as the performance metric.
Conclusion: rejoELM is a valid alternative for classification with reject option when classes are nearly equally represented. IrejoELM, on the other hand, is shown to be the best alternative for classification with reject option on imbalanced datasets. Since SDP problems are usually characterized as imbalanced learning problems, the use of IrejoELM is recommended. (C) 2016 Elsevier B.V. All rights reserved.
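The general idea of a reject option, and the two criteria named in the abstract (accuracy at a given rejection rate versus F-measure), can be illustrated with a minimal sketch. This is not the authors' rejoELM or IrejoELM: it assumes a generic classifier that outputs a score in [0, 1] and rejects any module whose score falls inside a band around the decision threshold. The function names and the `threshold` and `band` parameters are hypothetical, chosen only for illustration.

```python
from typing import List, Optional, Tuple


def classify_with_reject(scores: List[float],
                         threshold: float = 0.5,
                         band: float = 0.1) -> List[Optional[int]]:
    """Return 1 (defect-prone), 0 (not defect-prone), or None (rejected).

    Assumption: a score closer than `band` to `threshold` is treated as
    too uncertain, so the decision is postponed (e.g. to an expert).
    """
    labels: List[Optional[int]] = []
    for s in scores:
        if abs(s - threshold) < band:
            labels.append(None)  # reject: no automatic decision
        else:
            labels.append(1 if s > threshold else 0)
    return labels


def evaluate(y_true: List[int],
             y_pred: List[Optional[int]]) -> Tuple[float, float, float]:
    """Return (rejection rate, accuracy, F-measure).

    Accuracy and F-measure are computed over accepted samples only.
    """
    accepted = [(t, p) for t, p in zip(y_true, y_pred) if p is not None]
    rejection_rate = 1.0 - len(accepted) / len(y_true)
    tp = sum(1 for t, p in accepted if t == 1 and p == 1)
    fp = sum(1 for t, p in accepted if t == 0 and p == 1)
    fn = sum(1 for t, p in accepted if t == 1 and p == 0)
    correct = sum(1 for t, p in accepted if t == p)
    accuracy = correct / len(accepted) if accepted else 0.0
    denom = 2 * tp + fp + fn
    f_measure = 2 * tp / denom if denom else 0.0
    return rejection_rate, accuracy, f_measure


# Toy data, invented for this example only.
scores = [0.05, 0.45, 0.55, 0.9, 0.2, 0.7]
y_true = [0, 1, 0, 1, 1, 1]
labels = classify_with_reject(scores)
rejection_rate, accuracy, f_measure = evaluate(y_true, labels)
```

Widening `band` raises the rejection rate. On imbalanced data, accuracy on the accepted modules can stay high even when few defective modules are found, which is why the F-measure is the more informative criterion in that setting.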
Pages: 1085-1093
Number of pages: 9