A hybridization of multiple imputation and one-class bagging ensemble approach for missing value and class imbalance problem

Cited by: 0
Authors
Baro, Pranita [1]
Borah, Malaya Dutta [1]
Affiliations
[1] Natl Inst Technol, Dept Comp Sci & Engn, Silchar 788010, Assam, India
Keywords
Class imbalance; Missing values; Ensemble learning; One-class classification; Resampling; Hybrid; CLASSIFICATION; CLASSIFIERS; FRAMEWORK; FEATURES; MODEL; SMOTE
DOI
10.1007/s12530-024-09602-8
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Class imbalance in a dataset leads to erroneous outcomes that degrade learning techniques and to a high misclassification cost in the minority class. Along with class imbalance, a disproportionate amount of missing values in a dataset greatly hinders the effective performance of a method. Ensemble methods, which combine multiple methods, tackle such issues and show good results compared with individual methods. In this paper, a hybridization of multiple imputation and a one-class bagging ensemble approach is proposed to handle datasets that have both class imbalance and missing values. An in-depth analysis of this approach is presented, along with its effectiveness on the class imbalance problem. To tackle the misclassification of minority samples and missing values, a factor-based multiple imputation oversampling technique is used, and one-class classifiers are ensembled to increase performance on class-imbalanced datasets. Experiments are performed using a one-class support vector machine classifier, and the results are evaluated using the following metrics: Recall (detection rate), Specificity, f-measure, g-mean, AUC, and Precision. The proposed approach yields improvements of 6.3% in Recall, 4.92% in Specificity, 11.3% in f-measure, 9.4% in g-mean, 8.3% in AUC, and 8.03% in Precision.
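The threshold-based metrics named in the abstract can be illustrated with a minimal sketch. It assumes the minority class is treated as the positive class and uses the standard textbook definitions; the function name `imbalance_metrics` and the confusion-matrix counts are hypothetical and are not taken from the paper. AUC is left out because it requires ranked classifier scores rather than counts.

```python
import math


def imbalance_metrics(tp, fn, tn, fp):
    """Compute metrics from confusion-matrix counts (minority class = positive)."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0          # detection rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0     # true negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # f-measure: harmonic mean of precision and recall
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    # g-mean: geometric mean of recall and specificity
    g_mean = math.sqrt(recall * specificity)
    return {
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
        "g_mean": g_mean,
    }


# Hypothetical counts for an imbalanced test set (50 minority, 950 majority samples)
scores = imbalance_metrics(tp=40, fn=10, tn=900, fp=50)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

In this made-up example accuracy would be 94%, yet precision is below 0.45. That gap is why studies of imbalanced data typically report g-mean and f-measure alongside accuracy-style measures.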
Pages: 2021-2066
Number of pages: 46
Related papers
88 in total
  • [1] Combining weighted SMOTE with ensemble learning for the class-imbalanced prediction of small business credit risk
    Abedin, Mohammad Zoynul
    Guotai, Chi
    Hajek, Petr
    Zhang, Tong
    [J]. COMPLEX & INTELLIGENT SYSTEMS, 2023, 9 (04) : 3559 - 3579
  • [2] Effective One-Class Classifier Model for Memory Dump Malware Detection
    Al-Qudah, Mahmoud
    Ashi, Zein
    Alnabhan, Mohammad
    Abu Al-Haija, Qasem
    [J]. JOURNAL OF SENSOR AND ACTUATOR NETWORKS, 2023, 12 (01)
  • [3] Alcalá-Fdez J, 2011, J MULT-VALUED LOG S, V17, P255
  • [4] Aleryani A., 2020, SN Comput. Sci., V1, P1, DOI 10.1007/s42979-020-00131-0
  • [5] Angelov P, 2017, IEEE INT C CYBERNET, P436
  • [6] Anguita D., 2012, ESANN, P441, DOI 10.1007/S11042-019-08345-Y
  • [7] Armah Gabriel Kofi, 2014, International Journal of Machine Learning and Computing, V4, P417, DOI 10.7763/IJMLC.2014.V4.447
  • [8] New applications of ensembles of classifiers
    Barandela, R
    Sánchez, JS
    Valdovinos, RM
    [J]. PATTERN ANALYSIS AND APPLICATIONS, 2003, 6 (03) : 245 - 256
  • [9] Baro Pranita, 2023, Procedia Computer Science, P103, DOI 10.1016/j.procs.2022.12.406
  • [10] Baro Pranita, 2022, 2022 IEEE 9th Uttar Pradesh Section International Conference on Electrical, Electronics and Computer Engineering (UPCON), P1, DOI 10.1109/UPCON56432.2022.9986452