Cluster-based Weighted Oversampling for Ordinal Regression (CWOS-Ord)

Citations: 11

Authors:

Nekooeimehr, Iman [1]

Lai-Yuen, Susana K. [1]

Affiliation:

[1] Univ S Florida, Ind & Management Syst Engn, 4202 East Fowler Ave, ENB 118, Tampa, FL 33620 USA

Source:

NEUROCOMPUTING | 2016 / Vol. 218

Keywords:

Imbalanced dataset; Ordinal regression; Clustering; Oversampling; CLASSIFICATION; IMBALANCE; ALGORITHM; SMOTE;

DOI:

10.1016/j.neucom.2016.08.071

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A new oversampling method called Cluster-based Weighted Oversampling for Ordinal Regression (CWOS-Ord) is proposed for addressing ordinal regression with unbalanced datasets. Ordinal regression is a supervised approach for learning the ordinal relationship between classes. In many applications, the dataset is highly imbalanced where the instances of some classes (majority classes) occur much more frequently than instances of other classes (minority classes). This significantly degrades the classification performance as classifiers tend to strongly favor the majority classes. Standard oversampling methods can be used to improve the dataset class distribution; however, they do not consider the ordinal relationship between the classes. The proposed CWOS-Ord method aims to address this problem by first clustering minority classes and then oversampling them based on their distances and ordering relationship to other classes' instances. The final size to oversample the clusters depends on their complexity and their initial size so that more synthetic instances are generated for more complex and smaller clusters while fewer instances are generated for less complex and larger clusters. As a secondary contribution, existing oversampling methods for two-class classification have been extended for ordinal regression. Results demonstrate that the proposed CWOS-Ord method provides significantly better results compared to other methods based on the performance measures. (C) 2016 Elsevier B.V. All rights reserved.
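The allocation idea in the abstract (more synthetic instances for smaller, more complex clusters) can be illustrated with a rough sketch. This is not the authors' implementation: the abstract does not define the complexity measure or the weighting formula, so the `complexity` proxy (share of other-class points among the k nearest neighbours), the `eps` smoothing term, the weight `complexity / size`, and all function names below are assumptions made for illustration. The clustering step and the ordinal-aware distance weighting are omitted, and each cluster is assumed to be given with at least two points.

```python
import math
import random

def complexity(cluster, others, k=3):
    """Hypothetical complexity proxy (not the paper's measure): mean share of
    other-class points among each cluster point's k nearest neighbours."""
    total = 0.0
    for p in cluster:
        pool = [(math.dist(p, q), 0) for q in cluster if q is not p]
        pool += [(math.dist(p, q), 1) for q in others]
        nearest = sorted(pool)[:k]
        total += sum(flag for _, flag in nearest) / len(nearest)
    return total / len(cluster)

def allocate(clusters, others, n_new, k=3, eps=1e-3):
    """Split n_new synthetic points across clusters; the assumed weight grows
    with the complexity proxy and shrinks with cluster size."""
    raw = [(complexity(c, others, k) + eps) / len(c) for c in clusters]
    total = sum(raw)
    counts = [int(n_new * w / total) for w in raw]
    # Give the rounding remainder to the highest-weight cluster.
    counts[raw.index(max(raw))] += n_new - sum(counts)
    return counts

def interpolate(cluster, n, rng):
    """SMOTE-style generation: each new point lies on the segment between
    two randomly chosen members of the same cluster."""
    out = []
    for _ in range(n):
        a, b = rng.sample(cluster, 2)
        t = rng.random()
        out.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return out

# Toy data: a small minority cluster close to another class, and a larger
# minority cluster far away from it.
small_hard = [(0.0, 0.0), (0.2, 0.1)]
large_easy = [(5.0, 5.0), (5.1, 5.0), (5.0, 5.2),
              (5.2, 5.1), (5.1, 5.1), (4.9, 5.0)]
others = [(0.1, 0.0), (0.3, 0.2), (0.0, 0.2)]

counts = allocate([small_hard, large_easy], others, n_new=10)
synthetic = interpolate(small_hard, counts[0], random.Random(0))
```

Under these assumptions the small cluster that overlaps the other class receives most of the synthetic budget, which matches the qualitative behaviour the abstract describes; the exact proportions depend entirely on the assumed weighting.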


Pages: 51-60

Page count: 10

References: 42 in total
