GEP-based classifier for mining imbalanced data

Cited by: 12
Authors
Jedrzejowicz, Joanna [1]
Jedrzejowicz, Piotr [2]
Affiliations
[1] Univ Gdansk, Inst Informat, Fac Math Phys & Informat, PL-80308 Gdansk, Poland
[2] Gdynia Maritime Univ, Dept Informat Syst, Morska 83, PL-81225 Gdynia, Poland
Keywords
Imbalanced classification; Incremental learning; Gene expression programming; Data streams; Rules
DOI
10.1016/j.eswa.2020.114058
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The paper proposes an incremental Gene Expression Programming (GEP) classifier for mining imbalanced datasets. Imbalanced datasets are commonly encountered in real-life applications. Numerous algorithms, techniques, and tools have been proposed for dealing with imbalanced class distributions, yet none of them seems able to outperform all others in all possible applications. We believe that our approach can extend the available range of learners with proven good performance in mining imbalanced data and imbalanced streams. The idea is to adapt the GEP classifier to the requirements of the imbalanced data environment by reusing minority class instances and applying the incremental learning paradigm. The paper offers an overview of the related work and a detailed description of the proposed incremental learner. An extensive computational experiment, based on data from the KEEL dataset repository, shows that in numerous cases the approach is competitive with other state-of-the-art learners.
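The abstract names two ingredients, incremental (chunk-by-chunk) learning and reuse of minority class instances, without giving details. The sketch below is only one possible reading of that idea and is not the authors' algorithm: it splits a stream into fixed-size chunks and appends the minority instances retained from earlier chunks to each new chunk before it would be passed to a learner. The names `chunks_with_minority_reuse`, `class_counts`, `chunk_size`, and `minority_label` are hypothetical, and the GEP learner itself is left out.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Sequence, Tuple

Instance = Tuple[Sequence[float], int]  # (feature vector, class label)


def chunks_with_minority_reuse(
    stream: Iterable[Instance], chunk_size: int, minority_label: int
) -> Iterator[List[Instance]]:
    """Yield training chunks, each extended with minority instances seen earlier.

    Assumption: every past minority instance is kept and reused; the paper
    may use a different retention policy.
    """
    minority_pool: List[Instance] = []  # minority instances from past chunks
    chunk: List[Instance] = []
    for instance in stream:
        chunk.append(instance)
        if len(chunk) == chunk_size:
            # Current chunk plus the reused minority instances.
            yield chunk + minority_pool
            # Remember this chunk's minority instances for later chunks.
            minority_pool.extend(i for i in chunk if i[1] == minority_label)
            chunk = []
    if chunk:  # final, possibly shorter chunk
        yield chunk + minority_pool


def class_counts(chunk: List[Instance]) -> Counter:
    """Count instances per class label in a chunk."""
    return Counter(label for _, label in chunk)


if __name__ == "__main__":
    # Toy stream: 12 instances, label 1 (minority) at positions 0 and 5.
    toy = [((float(k),), 1 if k in (0, 5) else 0) for k in range(12)]
    for n, c in enumerate(chunks_with_minority_reuse(toy, 4, 1), start=1):
        print(n, len(c), dict(class_counts(c)))
```

In this toy run the share of minority instances in later chunks does not drop to zero even when a chunk contains none of its own, which is the effect that instance reuse is meant to have; an incremental learner would be updated on each yielded chunk in turn.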
Pages: 10