Adaptive semi-unsupervised weighted oversampling (A-SUWO) for imbalanced datasets

Cited by: 193

Authors:

Nekooeimehr, Iman [1]

Lai-Yuen, Susana K. [1]

Affiliations:

[1] Univ S Florida, Ind & Management Syst Engn, Tampa, FL 33620 USA

Source:

EXPERT SYSTEMS WITH APPLICATIONS | 2016 / Vol. 46

Keywords:

Imbalanced dataset; Classification; Clustering; Oversampling; PERFORMANCE;

DOI:

10.1016/j.eswa.2015.10.031

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In many applications, the dataset for classification may be highly imbalanced where most of the instances in the training set may belong to one of the classes (majority class), while only a few instances are from the other class (minority class). Conventional classifiers will strongly favor the majority class and ignore the minority instances. In this paper, we present a new oversampling method called Adaptive Semi-Unsupervised Weighted Oversampling (A-SUWO) for imbalanced binary dataset classification. The proposed method clusters the minority instances using a semi-unsupervised hierarchical clustering approach and adaptively determines the size to oversample each sub-cluster using its classification complexity and cross validation. Then, the minority instances are oversampled depending on their Euclidean distance to the majority class. A-SUWO aims to identify hard-to-learn instances by considering minority instances from each sub-cluster that are closer to the borderline. It also avoids generating synthetic minority instances that overlap with the majority class by considering the majority class in the clustering and oversampling stages. Results demonstrate that the proposed method achieves significantly better results in most datasets compared with other sampling methods. (C) 2015 Elsevier Ltd. All rights reserved.
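The distance-weighted oversampling step mentioned in the abstract can be illustrated with a heavily simplified sketch. This is not the authors' A-SUWO algorithm: it omits the semi-unsupervised clustering and the adaptive, cross-validated sub-cluster sizes entirely. It only shows the general idea of favoring minority instances that lie closer to the majority class. The function names, the inverse-distance weighting, and the SMOTE-style interpolation toward the nearest minority neighbor are all assumptions made for illustration. At least two minority points are required.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def borderline_weights(minority, majority):
    """Normalized weights that grow as a minority point nears the majority class.

    Assumption: weight is the inverse of the distance to the nearest
    majority point; the paper's actual weighting may differ.
    """
    nearest = [min(euclidean(p, q) for q in majority) for p in minority]
    inverse = [1.0 / (d + 1e-12) for d in nearest]  # small term avoids division by zero
    total = sum(inverse)
    return [w / total for w in inverse]


def weighted_oversample(minority, majority, n_new, seed=0):
    """Generate n_new synthetic minority points (hypothetical helper).

    Base points are drawn according to borderline_weights, then each
    synthetic point is placed on the segment between the base point and
    its nearest minority neighbor (a SMOTE-style assumption).
    """
    rng = random.Random(seed)
    weights = borderline_weights(minority, majority)
    synthetic = []
    for _ in range(n_new):
        base = rng.choices(minority, weights=weights, k=1)[0]
        neighbor = min(
            (p for p in minority if p is not base),
            key=lambda p: euclidean(base, p),
        )
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + t * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


if __name__ == "__main__":
    minority_points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
    majority_points = [(6.0, 0.0)]
    print(borderline_weights(minority_points, majority_points))
    print(weighted_oversample(minority_points, majority_points, n_new=4))
```

In this toy setup the minority point at (5.0, 0.0) sits closest to the majority point and therefore receives the largest sampling weight, so most synthetic points are generated near the class boundary.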


Pages: 405-416

Number of pages: 12

13 records in total

[1] IA-SUWO: An Improving Adaptive semi-unsupervised weighted oversampling for imbalanced classification problems
Wei Jianan
Huang Haisong
Yao Liguo
Hu Yao
Fan Qingsong
Huang Dong
KNOWLEDGE-BASED SYSTEMS, 2020, 203
[2] Improved Adaptive Semi-Unsupervised Weighted Oversampling using Sparsity Factor for Imbalanced Datasets
Ali, Haseeb
Salleh, Mohd Najib Mohd
Hussain, Kashif
INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2019, 10 (11) : 372 - 383
[3] An Adaptive Oversampling Technique for Imbalanced Datasets
Shahee, Shaukat Ali
Ananthakumar, Usha
ADVANCES IN DATA MINING: APPLICATIONS AND THEORETICAL ASPECTS (ICDM 2018), 2018, 10933 : 1 - 16
[4] A novel adaptive boundary weighted and synthetic minority oversampling algorithm for imbalanced datasets
Song, Xudong
Chen, Yilin
Liang, Pan
Wan, Xiaohui
Cui, Yunxian
JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2023, 44 (02) : 3245 - 3259
[5] AWGAN: An adaptive weighting GAN approach for oversampling imbalanced datasets
Guan, Shaopeng
Zhao, Xiaoyan
Xue, Yuewei
Pan, Hao
INFORMATION SCIENCES, 2024, 663
[6] Self-adaptive oversampling method based on the complexity of minority data in imbalanced datasets classification
Tao, Xinmin
Guo, Xinyue
Zheng, Yujia
Zhang, Xiaohan
Chen, Zhiyu
KNOWLEDGE-BASED SYSTEMS, 2023, 277
[7] An Adaptive and Robust Method for Oriented Oversampling With Spatial Information for Imbalanced Noisy Datasets
Deng, Yi
Li, Mingyong
IEEE ACCESS, 2023, 11 : 122610 - 122624
[8] Adaptive weighted over-sampling for imbalanced datasets based on density peaks clustering with heuristic filtering
Tao, Xinmin
Li, Qing
Guo, Wenjie
Ren, Chao
He, Qing
Liu, Rui
Zou, JunRong
INFORMATION SCIENCES, 2020, 519 : 43 - 73
[9] Development of a Neighborhood Based Adaptive Heterogeneous Oversampling Ensemble Classifier for Imbalanced Binary Class Datasets
Subbulaxmi, S. Santha
Arumugam, G.
PERVASIVE COMPUTING AND SOCIAL NETWORKING, ICPCSN 2022, 2023, 475 : 353 - 361
[10] NCLWO: Newton's cooling law-based weighted oversampling algorithm for imbalanced datasets with feature noise
Tao, Liangliang
Wang, Qingya
Zhu, Zhicheng
Yu, Fen
Yin, Xia
NEUROCOMPUTING, 2024, 610
