A Cluster-Based Under-Sampling Algorithm for Class-Imbalanced Data

Authors
Guzman-Ponce, A. [1, 2]
Valdovinos, R. M. [1]
Sanchez, J. S. [2]
Affiliations
[1] Univ Autonoma Estado Mexico, Fac Ingn, Toluca, Mexico
[2] Univ Jaume 1, Inst New Imaging Technol, Dept Comp Languages & Syst, Castellon de La Plana, Spain
Keywords
Class imbalance; DBSCAN; Under-sampling; Noise filtering
DOI
10.1007/978-3-030-61705-9_25
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Resampling methods are among the most popular strategies for tackling the class imbalance problem. Their objective is to compensate for the imbalanced class distribution by over-sampling the minority class and/or under-sampling the majority class. In this paper, a new under-sampling method based on the DBSCAN clustering algorithm is introduced. The main idea is to remove the majority class instances that DBSCAN identifies as noise. The proposed method is empirically compared with well-known state-of-the-art under-sampling algorithms over 25 benchmark databases, and the experimental results demonstrate the effectiveness of the new method in terms of sensitivity, specificity, and geometric mean of individual accuracies.
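The main idea in the abstract, dropping majority class instances that DBSCAN flags as noise, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say whether DBSCAN is run on the majority class alone or on the whole data set, nor which parameter values are used. The sketch assumes clustering of the majority class only, Euclidean distance, and illustrative values for `eps` and `min_samples`; the function names `dbscan_noise_mask` and `undersample_majority` are hypothetical. Because only the noise labelling is needed, it is computed directly from the DBSCAN definition (a point is noise if it is neither a core point nor within `eps` of a core point) rather than by building the full clusters.

```python
from math import dist


def dbscan_noise_mask(points, eps, min_samples):
    """Return a list of booleans: True where DBSCAN would label the point as noise.

    A point is a core point if its eps-neighbourhood (the point itself included)
    holds at least min_samples points. Noise points are those that are neither
    core points nor within eps of a core point.
    """
    n = len(points)
    # eps-neighbourhood of every point, by brute force (O(n^2), fine for a sketch)
    neighbours = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_samples for nb in neighbours]
    return [
        not is_core[i] and not any(is_core[j] for j in neighbours[i])
        for i in range(n)
    ]


def undersample_majority(X, y, majority_label, eps=0.5, min_samples=3):
    """Remove majority-class instances flagged as noise; keep everything else.

    Assumption: DBSCAN is applied to the majority class only. The default
    eps and min_samples are illustrative, not values taken from the paper.
    """
    maj_idx = [i for i, label in enumerate(y) if label == majority_label]
    noise = dbscan_noise_mask([X[i] for i in maj_idx], eps, min_samples)
    drop = {i for i, is_noise in zip(maj_idx, noise) if is_noise}
    keep = [i for i in range(len(y)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data: a dense majority group, one isolated majority point, two minority points.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0), (1.0, 1.0), (1.2, 1.0)]
y = [0, 0, 0, 0, 0, 1, 1]

X_res, y_res = undersample_majority(X, y, majority_label=0)
print(len(X), "->", len(X_res), "instances")
```

In the toy data the isolated majority point at (5.0, 5.0) has no neighbours within `eps`, so it is treated as noise and removed, while the minority instances are left untouched. With scikit-learn, the same noise mask could be obtained from `sklearn.cluster.DBSCAN(...).fit_predict(...)`, which labels noise points as `-1`.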
Pages: 299-311
Number of pages: 13