Data reduction and stacking for imbalanced data classification

Cited by: 5
Authors
Czarnowski, Ireneusz [1 ]
Jedrzejowicz, Piotr [1 ]
Affiliation
[1] Gdynia Maritime Univ, Dept Informat Syst, Morska, Gdynia, Poland
Keywords
Instance selection; clustering; stacking; imbalanced data; team of agents; INTEGRATION;
DOI
10.3233/JIFS-179335
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Class imbalance arises when the number of examples belonging to one class is much greater than the number of examples belonging to another. The discussed approach combines several techniques, including data reduction and stacking, with the aim of improving classification performance on imbalanced data. The paper proposes a cluster-based data reduction approach in which instances are selected from clusters, data reduction is carried out on instances belonging to the majority classes, and the aim of instance selection is to reduce the imbalance ratio between the majority and minority classes. Instance selection is carried out using an agent-based population learning algorithm. To increase the performance and generalization ability of prototype-based classification, the stacking technique is used. The proposed approach is validated experimentally using several benchmark datasets from the KEEL repository. The advantages and main features of the approach are discussed in light of the results of the computational experiment.
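The cluster-based reduction idea in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the paper selects instances with an agent-based population learning algorithm, whereas this sketch substitutes plain k-means and keeps the majority instance nearest each centroid. The function names (`imbalance_ratio`, `kmeans`, `reduce_majority`), the toy data, and the choice of `k` are all assumptions made for illustration.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (a stand-in for the paper's clustering step)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return clusters, centroids


def reduce_majority(X, y, majority_label, k):
    """Keep every minority instance and one majority prototype per cluster.

    Assumption: the prototype is the cluster member nearest the centroid;
    the paper instead searches for prototypes with a team of agents.
    """
    majority = [x for x, lab in zip(X, y) if lab == majority_label]
    clusters, centroids = kmeans(majority, k)
    prototypes = [
        min(members, key=lambda p: sq_dist(p, c))
        for members, c in zip(clusters, centroids)
        if members
    ]
    X_red = [x for x, lab in zip(X, y) if lab != majority_label] + prototypes
    y_red = [lab for lab in y if lab != majority_label]
    y_red += [majority_label] * len(prototypes)
    return X_red, y_red


# Hypothetical toy data: 8 majority instances in two blobs, 2 minority instances.
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3),
     (5.0, 5.0), (5.1, 4.9), (4.9, 5.2), (5.2, 5.1),
     (2.5, 9.0), (2.6, 9.2)]
y = ["maj"] * 8 + ["min"] * 2

X_red, y_red = reduce_majority(X, y, majority_label="maj", k=2)
print("imbalance ratio before:", imbalance_ratio(y))
print("imbalance ratio after: ", imbalance_ratio(y_red))
```

Setting `k` close to the minority class size pushes the imbalance ratio toward 1. The reduced set could then serve as training data for the base learners of a stacking ensemble, which is the second component the abstract describes.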
Pages: 7239-7249
Number of pages: 11
Related papers
27 in total
[1]  
Alcalá-Fdez J, 2011, J MULT-VALUED LOG S, V17, P255
[2]  
[Anonymous], 1996, 185996 EDRC CARN MEL
[3]  
[Anonymous], 2009, SAMPLING DESIGN ANAL
[4]   Adaptive integrated image segmentation and object recognition [J].
Bhanu, B ;
Peng, J .
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS, 2000, 30 (04) :427-441
[5]  
Czarnowski I., 2018, COMPLEXITY
[6]   An Approach to Data Reduction for Learning from Big Datasets: Integrating Stacking, Rotation, and Agent Population Learning Techniques [J].
Czarnowski, Ireneusz ;
Jedrzejowicz, Piotr .
COMPLEXITY, 2018,
[7]   Cluster-Based Instance Selection for the Imbalanced Data Classification [J].
Czarnowski, Ireneusz ;
Jedrzejowicz, Piotr .
COMPUTATIONAL COLLECTIVE INTELLIGENCE, ICCCI 2018, PT II, 2018, 11056 :191-200
[8]  
Czarnowski I, 2011, LECT NOTES ARTIF INT, V6682, P436, DOI 10.1007/978-3-642-22000-5_45
[9]  
Czarnowski I, 2011, LECT NOTES COMPUT SC, V6660, P3, DOI 10.1007/978-3-642-21884-2_1
[10]  
Czarnowski I, 2010, LECT NOTES ARTIF INT, V6421, P353, DOI 10.1007/978-3-642-16693-8_37