A genetic algorithm-based approach for class-imbalanced learning

Cited by: 0
Authors
Dong, Shangyan [1]
Wu, Yongcheng [1]
Affiliations
[1] Jingchu Univ Technol, 33 Xiangshan Rd, Jingmen 448000, Hubei, Peoples R China
Source
THIRD INTERNATIONAL WORKSHOP ON PATTERN RECOGNITION | 2018 / Vol. 10828
Keywords
Class-imbalance datasets; over-sampling; genetic algorithm; F-measure
DOI
10.1117/12.2501764
Chinese Library Classification number
O43 [Optics]
Discipline classification code
070207; 0803
Abstract
In real-world machine learning, datasets are often class-imbalanced. Traditional classification methods, which aim to maximize overall classification accuracy, are poorly suited to such data. To tackle this issue and improve classifier performance, methods based on oversampling, undersampling, and cost-sensitive classification are widely employed. In this paper, we propose a new genetic algorithm-based over-sampling technique for class-imbalanced datasets. The genetic algorithm creates optimized synthetic minority-class instances to produce a balanced training dataset. Experimental results on five class-imbalanced datasets show that our method performs better than three existing sampling techniques in terms of AUC and F-measure.
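The abstract does not specify the chromosome encoding, fitness function, or genetic operators, so the following is only a minimal sketch of how a genetic algorithm could generate synthetic minority-class instances. Everything in it is an assumption made for illustration and not the authors' method: the function names (`ga_oversample`, `fitness`, `crossover`, `mutate`), the distance-based fitness, the size-2 tournament selection, and all parameter values are hypothetical.

```python
import math
import random


def nearest_distance(point, others):
    """Euclidean distance from point to its closest neighbour in others."""
    return min(math.dist(point, other) for other in others)


def fitness(candidate, minority, majority):
    """Assumed fitness: reward distance from the majority class and
    closeness to the minority class (not taken from the paper)."""
    return nearest_distance(candidate, majority) - nearest_distance(candidate, minority)


def crossover(parent_a, parent_b, rng):
    """Arithmetic crossover: a random convex combination of two parents."""
    alpha = rng.random()
    return tuple(alpha * a + (1 - alpha) * b for a, b in zip(parent_a, parent_b))


def mutate(candidate, rng, rate=0.1, scale=0.05):
    """Add Gaussian noise to each feature with probability rate."""
    return tuple(x + rng.gauss(0, scale) if rng.random() < rate else x for x in candidate)


def ga_oversample(minority, majority, generations=20, seed=0):
    """Evolve enough synthetic minority points to equalize the class sizes."""
    rng = random.Random(seed)
    n_needed = len(majority) - len(minority)
    if n_needed <= 0:
        return []  # already balanced, nothing to generate
    pop_size = 2 * n_needed

    def score(candidate):
        return fitness(candidate, minority, majority)

    # Initial population: interpolations between random pairs of minority points.
    population = [
        crossover(rng.choice(minority), rng.choice(minority), rng)
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            # Tournament selection of size 2 for each parent.
            parent_a = max(rng.sample(population, 2), key=score)
            parent_b = max(rng.sample(population, 2), key=score)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        # Elitist survival: keep the fittest pop_size of parents and children.
        population = sorted(population + children, key=score, reverse=True)[:pop_size]
    return population[:n_needed]


# Toy data: 3 minority points and 6 majority points in two dimensions.
minority = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3)]
majority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.3), (1.3, 1.0), (0.8, 0.8)]
synthetic = ga_oversample(minority, majority)
balanced_minority = minority + synthetic
```

With this assumed fitness the population tends to cluster near existing minority points, so a practical variant would likely add a diversity term or score candidates by the AUC or F-measure of a classifier trained on the augmented data, the two metrics the abstract reports.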
Pages: 7