CLUSTERING-BASED SUBSET ENSEMBLE LEARNING METHOD FOR IMBALANCED DATA

Cited by: 0

Authors:

Hu, Xiao-Sheng [1]

Zhang, Run-Jing [2]

Affiliations:

[1] Foshan Univ, Coll Elect & Informat Engn, Foshan 528000, Peoples R China

[2] Foshan Univ, Informat & Educ Technol Ctr, Foshan 528000, Peoples R China

Source:

PROCEEDINGS OF 2013 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), VOLS 1-4 | 2013

Keywords:

Imbalanced data; Classification; Clustering; Ensemble learning

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In recent research, classification of imbalanced datasets has received considerable attention. Most classification algorithms tend to assign incoming data to the majority class, which results in poor classification performance on minority-class instances, even though these are usually of much greater interest. In this paper we propose a clustering-based subset ensemble learning method for handling the class imbalance problem. In the proposed approach, new balanced training datasets are first produced using clustering-based under-sampling; the new training sets are then classified by applying four algorithms (Decision Tree, Naive Bayes, KNN and SVM) as the base algorithms in combined bagging. An experimental analysis is carried out over a wide range of highly imbalanced datasets. The results show that our method can stably and effectively improve classification performance on both rare and normal classes.
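
The two stages named in the abstract (clustering-based under-sampling into balanced subsets, then an ensemble vote over models trained on those subsets) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no details of the clustering algorithm, the sampling rule, or how the four base learners are combined, so everything below is an assumption. The sketch uses plain k-means, draws from each majority cluster in proportion to its size, and substitutes a 1-nearest-neighbour classifier for the Decision Tree, Naive Bayes, KNN and SVM learners. All function names (`kmeans`, `balanced_subsets`, `ensemble_predict`) are hypothetical.

```python
import random
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means (assumed here); returns the non-empty clusters."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Move each center to its cluster mean; keep the old center if empty.
        centers = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return [cl for cl in clusters if cl]

def balanced_subsets(majority, minority, k, n_subsets, rng):
    """Build n_subsets training sets. Each keeps every minority point (label 1)
    and adds a roughly equal number of majority points (label 0), drawn from
    every cluster in proportion to the cluster's size (an assumed rule)."""
    clusters = kmeans(majority, k, rng)
    subsets = []
    for _ in range(n_subsets):
        sample = []
        for cl in clusters:
            quota = max(1, round(len(minority) * len(cl) / len(majority)))
            sample.extend(rng.sample(cl, min(quota, len(cl))))
        subsets.append([(p, 0) for p in sample] + [(p, 1) for p in minority])
    return subsets

def predict_1nn(train, x):
    """Stand-in base learner: label of the nearest training point."""
    return min(train, key=lambda item: sq_dist(item[0], x))[1]

def ensemble_predict(subsets, x):
    """Majority vote over one base learner per balanced subset."""
    votes = Counter(predict_1nn(s, x) for s in subsets)
    return votes.most_common(1)[0][0]

# Toy usage: 200 majority points around (0, 0), 10 minority points around (4, 4).
rng = random.Random(0)
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
minority = [(rng.gauss(4, 0.5), rng.gauss(4, 0.5)) for _ in range(10)]
subsets = balanced_subsets(majority, minority, k=4, n_subsets=5, rng=rng)
print(ensemble_predict(subsets, (4.0, 4.0)))
```

Sampling from every cluster, rather than uniformly from the whole majority class, is one way to keep each balanced subset representative of the majority class's structure while still discarding most of its instances. In practice the stand-in learner would be replaced by library implementations of the four base algorithms.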


Pages: 35-39

Page count: 5

