GEP-based classifier for mining imbalanced data

Cited by: 12
Authors
Jedrzejowicz, Joanna [1]
Jedrzejowicz, Piotr [2]
Affiliations
[1] Univ Gdansk, Inst Informat, Fac Math Phys & Informat, PL-80308 Gdansk, Poland
[2] Gdynia Maritime Univ, Dept Informat Syst, Morska 83, PL-81225 Gdynia, Poland
Keywords
Imbalanced classification; Incremental learning; Gene expression programming; Data streams; Rules
DOI
10.1016/j.eswa.2020.114058
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The paper proposes an incremental Gene Expression Programming classifier for mining imbalanced datasets. Imbalanced datasets are commonly encountered in real-life applications. There exist numerous algorithms, techniques, and tools which are proposed as suitable for dealing with imbalanced class distribution. Yet, none of them seems to be able to outperform all others in all possible applications. We believe that our approach can extend the available range of learners that have proven good performance in mining imbalanced data and imbalanced streams. The idea is to adapt the GEP classifier to requirements of the imbalanced data environment with reuse of the minority class instances, and application of the incremental learning paradigm. The paper offers an overview of the related work and a detailed description of the proposed incremental learner. An extensive computational experiment, based on data from the KEEL dataset repository, proves that in numerous cases the approach is competitive to other state-of-the-art learners.
Pages: 10