GEP-based classifier for mining imbalanced data

Cited by: 12
Authors
Jedrzejowicz, Joanna [1]
Jedrzejowicz, Piotr [2]
Affiliations
[1] Univ Gdansk, Inst Informat, Fac Math Phys & Informat, PL-80308 Gdansk, Poland
[2] Gdynia Maritime Univ, Dept Informat Syst, Morska 83, PL-81225 Gdynia, Poland
Keywords
Imbalanced classification; Incremental learning; Gene expression programming; Data streams; Rules
DOI
10.1016/j.eswa.2020.114058
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The paper proposes an incremental Gene Expression Programming (GEP) classifier for mining imbalanced datasets. Imbalanced datasets are commonly encountered in real-life applications. Numerous algorithms, techniques, and tools have been proposed for dealing with imbalanced class distributions, yet none of them seems able to outperform all others in all possible applications. We believe that our approach can extend the available range of learners with proven good performance in mining imbalanced data and imbalanced streams. The idea is to adapt the GEP classifier to the requirements of the imbalanced data environment by reusing minority class instances and applying the incremental learning paradigm. The paper offers an overview of the related work and a detailed description of the proposed incremental learner. An extensive computational experiment, based on data from the KEEL dataset repository, shows that in numerous cases the approach is competitive with other state-of-the-art learners.
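The abstract does not give the mechanics of minority-instance reuse, so the following is only a minimal sketch of one way the idea could look, not the authors' algorithm. It assumes the data arrive in fixed-size chunks and that minority instances from earlier chunks are kept in a bounded memory and appended to each later training chunk. The function name `chunks_with_minority_reuse` and the parameters `chunk_size`, `minority_label`, and `memory_size` are hypothetical; the classifier itself (GEP in the paper) is left out.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List, Tuple

Instance = Tuple[List[float], int]  # (feature vector, class label)


def chunks_with_minority_reuse(
    stream: Iterable[Instance],
    chunk_size: int,
    minority_label: int,
    memory_size: int,
) -> Iterator[List[Instance]]:
    """Yield training chunks augmented with minority instances seen earlier.

    Assumption (not from the paper): a bounded FIFO memory holds past
    minority instances; the oldest ones are dropped when it is full.
    """
    memory: Deque[Instance] = deque(maxlen=memory_size)
    chunk: List[Instance] = []
    for instance in stream:
        chunk.append(instance)
        if len(chunk) == chunk_size:
            # Train on the current chunk plus remembered minority instances.
            yield chunk + list(memory)
            # Remember this chunk's minority instances for later chunks.
            memory.extend(i for i in chunk if i[1] == minority_label)
            chunk = []
    if chunk:
        # Final, possibly shorter, chunk.
        yield chunk + list(memory)


if __name__ == "__main__":
    labels = [0, 1, 0, 0, 0, 0]
    data = [([float(k)], y) for k, y in enumerate(labels)]
    for n, batch in enumerate(chunks_with_minority_reuse(data, 3, 1, 5)):
        minority = sum(1 for _, y in batch if y == 1)
        print(f"chunk {n}: {len(batch)} instances, {minority} minority")
```

Each yielded chunk would be passed to whatever incremental learner is in use; the second chunk in the usage example contains no minority instance of its own, so the reused one is the only minority example the learner sees there.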
Pages: 10