A Selective Dynamic Sampling Back-Propagation Approach for Handling the Two-Class Imbalance Problem

Cited by: 8
Authors
Alejo, Roberto [1 ]
Monroy-de-Jesus, Juan [2 ]
Pacheco-Sanchez, Juan H. [3 ]
Lopez-Gonzalez, Erika [1 ]
Antonio-Velazquez, Juan A. [1 ]
Affiliations
[1] Tecnol Estudios Super Jocotitlan, Pattern Recognit Lab, Carretera Toluca Atlacomulco KM 44-8, Jocotitlan 50700, Mexico
[2] Univ Autonoma Estado Mexico, Comp Sci, Carretera Toluca Atlacomulco KM 60, Atlacomulco 50000, Mexico
[3] Inst Tecnol Toluca, Div Grad Studies & Res, Av Tecnol S-N, Metepec 52149, Edo De Mexico, Mexico
Source
APPLIED SCIENCES-BASEL | 2016 / Vol. 6 / Issue 07
Keywords
two-class imbalance problem; average samples; over-sampling; under-sampling; dynamic sampling; NEURAL-NETWORKS; CLASSIFICATION; CLASSIFIERS; SMOTE;
DOI
10.3390/app6070200
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
In this work, we developed a Selective Dynamic Sampling Approach (SDSA) to deal with the class imbalance problem. It is based on the idea of using only the most appropriate samples during the neural network training stage. The average samples are the best for training the neural network: they are neither hard nor easy to learn, and they could improve classifier performance. The experimental results show that the proposed method deals successfully with the two-class imbalance problem. It is very competitive with well-known over-sampling and dynamic sampling approaches, and it often outperforms the under-sampling and standard back-propagation methods. SDSA is a very simple and efficient method for automatically selecting the most appropriate samples (average samples) during back-propagation training. In the training stage, SDSA uses significantly fewer samples than the popular over-sampling approaches, and even fewer than standard back-propagation trained on the original dataset.
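The idea of keeping only "average" samples can be illustrated with a minimal sketch. The abstract does not state the selection rule, so the one used here is purely an assumption: a sample counts as average when its absolute output error falls inside a middle band. The function name `select_average_samples` and the thresholds `low` and `high` are hypothetical and are not taken from the paper.

```python
def select_average_samples(targets, outputs, low=0.2, high=0.8):
    """Flag the samples to keep for the next training pass.

    Assumption (not from the source): a sample is 'average' when its
    absolute output error lies in [low, high]. Errors below `low` are
    treated as easy samples, errors above `high` as hard samples.
    """
    return [low <= abs(t - o) <= high for t, o in zip(targets, outputs)]


# Toy two-class example: desired outputs and current network outputs.
targets = [1.0, 1.0, 0.0, 0.0, 1.0]
outputs = [0.95, 0.50, 0.40, 0.95, 0.05]

mask = select_average_samples(targets, outputs)
# Indices of the samples that would be used in the next epoch.
selected = [i for i, keep in enumerate(mask) if keep]
print(selected)
```

In a training loop, such a mask would be recomputed after each epoch so that the subset of samples changes as the network outputs change, which is one plausible reading of "dynamic" sampling here.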
Pages: 17