Improving Minority Class Recall through a Novel Cluster-Based Oversampling Technique

Authors
Prexawanprasut, Takorn [1]
Banditwattanawong, Thepparit [1]
Affiliations
[1] Kasetsart Univ, Dept Comp Sci, Krung Thep Maha Nakhon 10900, Thailand
Source
INFORMATICS-BASEL | 2024 / Vol. 11 / Issue 02
Keywords
imbalanced data; cluster-based oversampling; SMOTE; inter-cluster confusion
DOI
10.3390/informatics11020035
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
In this study, we propose an approach to address the pressing issue of false negative errors by enhancing minority class recall within imbalanced data sets commonly encountered in machine learning applications. Through the utilization of a cluster-based oversampling technique in conjunction with an information entropy evaluation, our approach effectively targets areas of ambiguity inherent in the data set. An extensive evaluation across a diverse range of real-world data sets characterized by inter-cluster complexity demonstrates the superior performance of our method compared to that of existing oversampling techniques. Particularly noteworthy is its significant improvement within the Delinquency Telecom data set, where it achieves a remarkable increase of up to 30.54 percent in minority class recall compared to the original data set. This notable reduction in false negative errors underscores the importance of our methodology in accurately identifying and classifying instances from underrepresented classes, thereby enhancing model performance in imbalanced data scenarios.
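The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it names: score each cluster by the information entropy of its class labels, then oversample the minority class only inside clusters that look ambiguous. The function names, the `threshold` and `n_new` parameters, and the SMOTE-style interpolation step are illustrative assumptions, not the authors' published method. Cluster assignments are assumed to come from any prior clustering step.

```python
import math
import random
from collections import Counter, defaultdict


def cluster_entropy(labels):
    """Shannon entropy (in bits) of the class labels inside one cluster."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def oversample_ambiguous_clusters(X, y, clusters, minority=1,
                                  threshold=0.5, n_new=2, seed=0):
    """Return synthetic minority points drawn from high-entropy clusters.

    X: list of feature vectors, y: class labels, clusters: cluster id per row.
    All parameter names and defaults here are hypothetical.
    """
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for i, c in enumerate(clusters):
        by_cluster[c].append(i)

    synthetic = []
    for c, idx in sorted(by_cluster.items()):
        if cluster_entropy([y[i] for i in idx]) < threshold:
            continue  # cluster is nearly pure, so it is left untouched
        minority_pts = [X[i] for i in idx if y[i] == minority]
        if len(minority_pts) < 2:
            continue  # interpolation needs at least two minority points
        for _ in range(n_new):
            a, b = rng.sample(minority_pts, 2)
            t = rng.random()
            # SMOTE-style step: a random point on the segment between a and b
            synthetic.append([ai + t * (bi - ai) for ai, bi in zip(a, b)])
    return synthetic


# Toy data: cluster 0 is pure majority, cluster 1 mixes both classes.
X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5], [6, 6]]
y = [0, 0, 0, 1, 1, 0, 0]
clusters = [0, 0, 0, 1, 1, 1, 1]
new_points = oversample_ambiguous_clusters(X, y, clusters)
```

In this toy run only cluster 1 is oversampled, because its labels are evenly mixed while cluster 0 contains a single class. A real pipeline would choose the clustering method and entropy threshold by validation on the data set at hand.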
Pages: 24