Imbalanced Data Classification Based on Feature Selection Techniques

Cited by: 10
Authors
Ksieniewicz, Pawel [1]
Wozniak, Michal [1]
Affiliations
[1] Wroclaw Univ Sci & Technol, Dept Syst & Comp Networks, Wroclaw, Poland
Source
INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING (IDEAL 2018), PT II | 2018 / Vol. 11315
Keywords
Machine learning; Classification; Imbalanced data; Feature selection; Random search
DOI
10.1007/978-3-030-03496-2_33
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The difficulty of many classification tasks lies in the nature of the analyzed data, such as a disproportionate number of examples from different classes in the learning set. Ignoring this characteristic causes canonical classifiers to display strongly biased performance on imbalanced datasets. In this work, a novel technique for forming classifier ensembles for imbalanced datasets is presented. On the one hand, it takes into consideration the selected features used for training individual classifiers; on the other hand, it ensures an appropriate diversity of the classifier ensemble. The proposed method was evaluated in computer experiments carried out on several benchmark datasets. The results seem to confirm the usefulness of the proposed concept.
Pages: 296-303
Number of pages: 8