Improving Minority Class Recall through a Novel Cluster-Based Oversampling Technique

Cited by: 0
Authors
Prexawanprasut, Takorn [1]
Banditwattanawong, Thepparit [1]
Affiliations
[1] Kasetsart Univ, Dept Comp Sci, Krung Thep Maha Nakhon 10900, Thailand
Source
INFORMATICS-BASEL | 2024 / Vol. 11 / Issue 02
Keywords
imbalanced data; cluster-based oversampling; SMOTE; inter-cluster confusion
DOI
10.3390/informatics11020035
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
In this study, we propose an approach to address the pressing issue of false negative errors by enhancing minority class recall within imbalanced data sets commonly encountered in machine learning applications. Through the utilization of a cluster-based oversampling technique in conjunction with an information entropy evaluation, our approach effectively targets areas of ambiguity inherent in the data set. An extensive evaluation across a diverse range of real-world data sets characterized by inter-cluster complexity demonstrates the superior performance of our method compared to that of existing oversampling techniques. Particularly noteworthy is its significant improvement within the Delinquency Telecom data set, where it achieves a remarkable increase of up to 30.54 percent in minority class recall compared to the original data set. This notable reduction in false negative errors underscores the importance of our methodology in accurately identifying and classifying instances from underrepresented classes, thereby enhancing model performance in imbalanced data scenarios.
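The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it names: measure class ambiguity per cluster with Shannon entropy, then oversample the minority class in the ambiguous clusters. It is not the authors' method. The function names, the `threshold` parameter, the interpolation step, and the toy data are all assumptions made for illustration, and cluster assignments are assumed to be precomputed (for example by k-means).

```python
import numpy as np

def cluster_entropy(y, clusters):
    """Shannon entropy (in bits) of the class labels inside each cluster."""
    out = {}
    for c in np.unique(clusters):
        _, counts = np.unique(y[clusters == c], return_counts=True)
        p = counts / counts.sum()
        out[int(c)] = float(-(p * np.log2(p)).sum())
    return out

def oversample_ambiguous(X, y, clusters, minority=1, threshold=0.5, seed=0):
    """Hypothetical helper: in every cluster whose label entropy exceeds
    `threshold`, add interpolated minority points until the minority count
    matches the majority count in that cluster."""
    rng = np.random.default_rng(seed)
    new_X, new_y = [X], [y]
    for c, h in cluster_entropy(y, clusters).items():
        if h <= threshold:
            continue  # cluster is nearly pure, so it is left untouched
        in_c = clusters == c
        mino = X[in_c & (y == minority)]
        deficit = int((in_c & (y != minority)).sum()) - len(mino)
        if len(mino) < 2 or deficit <= 0:
            continue
        # Interpolate between random pairs of minority points in the cluster.
        a = mino[rng.integers(len(mino), size=deficit)]
        b = mino[rng.integers(len(mino), size=deficit)]
        lam = rng.random((deficit, 1))
        new_X.append(a + lam * (b - a))
        new_y.append(np.full(deficit, minority, dtype=y.dtype))
    return np.concatenate(new_X), np.concatenate(new_y)

# Toy data (assumed): cluster 0 mixes both classes, cluster 1 is pure majority.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [0.4, 0.6],
              [5, 5], [5, 6], [6, 5]], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 0, 0, 0])
clusters = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1])

entropies = cluster_entropy(y, clusters)
X_res, y_res = oversample_ambiguous(X, y, clusters)
```

In this toy case only the mixed cluster receives synthetic points, which reflects the idea of concentrating oversampling where the classes are hardest to separate rather than across the whole data set.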
Pages: 24
Related papers
50 in total
  • [21] Combining Synthetic Minority Oversampling Technique And Subset Feature Selection Technique For Class Imbalance Problem
    Lachheta, Pawan
    Bawa, Seema
    INTERNATIONAL CONFERENCE ON ADVANCES IN INFORMATION COMMUNICATION TECHNOLOGY & COMPUTING, 2016, 2016,
  • [22] BO-SMOTE: A Novel Bayesian-Optimization-Based Synthetic Minority Oversampling Technique
    Yan, Shen
    Zhao, Ziyan
    Liu, Shixin
    Zhou, Mengchu
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2024, 54 (04) : 2079 - 2091
  • [23] CCO: A Cluster Core-Based Oversampling Technique for Improved Class-Imbalanced Learning
    Mondal, Priyobrata
    Ansari, Faizanuddin
    Das, Swagatam
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2024, : 1 - 13
  • [24] A novel oversampling technique for class-imbalanced learning based on SMOTE and natural neighbors
    Li, Junnan
    Zhu, Qingsheng
    Wu, Quanwang
    Fan, Zhu
    INFORMATION SCIENCES, 2021, 565 : 438 - 455
  • [25] ANALYSIS OF A CLASS OF CLUSTER-BASED MULTIPROCESSOR SYSTEMS
    AGRAWAL, DP
    MAHGOUB, IO
    INFORMATION SCIENCES, 1987, 43 (1-2) : 85 - 105
  • [26] Improving the prediction accuracy in blended learning environment using synthetic minority oversampling technique
    Dimic, Gabrijela
    Rancic, Dejan
    Macek, Nemanja
    Spalevic, Petar
    Drasute, Vida
    INFORMATION DISCOVERY AND DELIVERY, 2019, 47 (02) : 76 - 83
  • [27] A novel cluster-based image retrieval
    Lotfy, HM
    Elmaghraby, AS
    Proceedings of the Fourth IEEE International Symposium on Signal Processing and Information Technology, 2004, : 338 - 341
  • [28] Synthetic minority oversampling technique based on natural neighborhood graph with subgraph cores for class-imbalanced classification
    Zhao, Ming
    JOURNAL OF SUPERCOMPUTING, 2025, 81 (01):
  • [29] Note on "A Comprehensive Analysis of Synthetic Minority Oversampling Technique (SMOTE) for handling class imbalance"
    Ferrer, Carlos A.
    Aragon, Efren
    INFORMATION SCIENCES, 2023, 630 : 322 - 324
  • [30] Cluster-Based Minority Over-Sampling for Imbalanced Datasets
    Puntumapon, Kamthorn
    Rakthamamon, Thanawin
    Waiyamai, Kitsana
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2016, E99D (12) : 3101 - 3109