Comparison Of The Different Sampling Techniques For Imbalanced Classification Problems In Machine Learning

Cited by: 5
Authors
Peng Zhihao [1]
Yan Fenglong [1]
Li Xucheng [1]
Affiliations
[1] Dalian Neusoft Univ Informat, Sch Comp & Software, Dalian 116626, Peoples R China
Source
2019 11TH INTERNATIONAL CONFERENCE ON MEASURING TECHNOLOGY AND MECHATRONICS AUTOMATION (ICMTMA 2019) | 2019
Keywords
Machine Learning; Imbalanced Classification; Datasets
DOI
10.1109/ICMTMA.2019.00101
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
An imbalanced class distribution is a scenario in which the number of observations belonging to one class is significantly lower than the number belonging to the other classes. Machine learning algorithms are usually designed to maximize accuracy by reducing the overall error, so they do not take the class proportions or the balance between classes into account. In this paper, we first describe the various approaches to solving such class imbalance problems using different sampling techniques. We then weigh the pros and cons of each technique. Finally, we propose an approach that creates a balanced class distribution and applies an ensemble learning technique designed specifically for imbalanced class distributions.
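As an illustration of the kind of sampling technique the abstract refers to, random oversampling and random undersampling can be sketched as below. This is a minimal sketch, not the authors' implementation: the function names `random_oversample` and `random_undersample`, the toy 95/5 dataset, and the fixed seed are all assumptions made for the example. It also shows why plain accuracy is misleading here: a model that always predicts the majority class already scores 95% on the toy data.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of each smaller class until
    every class has as many rows as the largest class.
    (Hypothetical helper, for illustration only.)"""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lbl in zip(X, y) if lbl == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """Keep a random subset of each class so that every class has
    as many rows as the smallest class.
    (Hypothetical helper, for illustration only.)"""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    X_out, y_out = [], []
    for label in counts:
        pool = [x for x, lbl in zip(X, y) if lbl == label]
        for x in rng.sample(pool, target):
            X_out.append(x)
            y_out.append(label)
    return X_out, y_out


# Toy dataset (assumed): 95 majority rows (label 0), 5 minority rows (label 1).
X = [[i] for i in range(100)]
y = [0] * 95 + [1] * 5

# Accuracy of a trivial model that always predicts the majority class.
baseline_acc = sum(1 for lbl in y if lbl == 0) / len(y)

X_over, y_over = random_oversample(X, y)
X_under, y_under = random_undersample(X, y)
over_counts = Counter(y_over)
under_counts = Counter(y_under)

print("majority-only accuracy:", baseline_acc)
print("after oversampling:", dict(over_counts))
print("after undersampling:", dict(under_counts))
```

Oversampling keeps all the original data but repeats minority rows, which can encourage overfitting. Undersampling discards majority rows, which can throw away useful information. These are the usual trade-offs weighed when choosing between the two.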
Pages: 431-434
Number of pages: 4