Effective Class-Imbalance Learning Based on SMOTE and Convolutional Neural Networks

Cited by: 41
Authors
Joloudari, Javad Hassannataj [1]
Marefat, Abdolreza [2]
Nematollahi, Mohammad Ali [3]
Oyelere, Solomon Sunday [4]
Hussain, Sadiq [5]
Affiliations
[1] Univ Birjand, Fac Engn, Dept Comp Engn, Birjand 9717434765, Iran
[2] Islamic Azad Univ, Tech & Engn Fac, Dept Artificial Intelligence, South Tehran Branch, Tehran 1477893780, Iran
[3] Fasa Univ, Dept Comp Sci, Fasa 7461686131, Iran
[4] Lulea Univ Technol, Dept Comp Sci Elect & Space Engn, S-93187 Skelleftea, Sweden
[5] Dibrugarh Univ, Examinat Branch, Dibrugarh 786004, Assam, India
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 06
Keywords
imbalanced data; resampling; normalization; deep neural network; convolutional neural network; CORONARY-ARTERY-DISEASE; CLASSIFICATION; DIAGNOSIS; CLASSIFIERS; ALGORITHMS;
DOI
10.3390/app13064006
Chinese Library Classification
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Imbalanced Data (ID) is a problem that prevents Machine Learning (ML) models from achieving satisfactory results. ID arises when the samples of one class outnumber those of the other by a wide margin, which biases such models' learning process towards the majority class. In recent years, several solutions have been proposed to address this issue; they either synthetically generate new data for the minority class or reduce the number of majority-class samples to balance the data. In this paper, we investigate the effectiveness of methods based on Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs) combined with a variety of well-known imbalanced data solutions, namely oversampling and undersampling. We then propose a CNN-based model in combination with the Synthetic Minority Oversampling Technique (SMOTE) to handle imbalanced data effectively. To evaluate our methods, we used the KEEL, breast cancer, and Z-Alizadeh Sani datasets. To obtain reliable results, we ran our experiments 100 times with randomly shuffled data distributions. The classification results demonstrate that the combined SMOTE-Normalization-CNN outperforms the other methodologies, achieving 99.08% accuracy on the 24 imbalanced datasets. Therefore, the proposed mixed model can be applied to imbalanced binary classification problems on other real datasets.
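The oversampling-then-normalization idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `smote` and `min_max_normalize`, the neighbour count `k=5`, the toy data, and the choice of min-max scaling are all assumptions made for illustration, and the CNN classifier itself is omitted.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours.
    Assumes at least two minority samples."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of `base` within the minority class, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def min_max_normalize(rows):
    """Rescale every feature column to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # guard constant columns
    return [[(v - lo) / s for v, lo, s in zip(r, lows, spans)] for r in rows]


# Toy imbalanced data (hypothetical): 40 majority vs. 8 minority samples.
majority = [[float(i), float(i % 7)] for i in range(40)]
minority = [[50.0 + i, 20.0 + 2 * i] for i in range(8)]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority))

# Normalize the balanced feature matrix before it would be passed to a classifier.
features = min_max_normalize(majority + minority + new_points)
labels = [0] * len(majority) + [1] * (len(minority) + len(new_points))
```

In a real experiment the oversampling and the scaling statistics would normally be computed on the training split only, so that no information from the test data leaks into the model.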
Pages: 34
Related papers
50 in total
  • [41] Relay Backpropagation for Effective Learning of Deep Convolutional Neural Networks
    Shen, Li
    Lin, Zhouchen
    Huang, Qingming
    COMPUTER VISION - ECCV 2016, PT VII, 2016, 9911 : 467 - 482
  • [42] Class-Incremental Learning of Convolutional Neural Networks Based on Double Consolidation Mechanism
    Jin, Leilei
    Liang, Hong
    Yang, Changsheng
    IEEE ACCESS, 2020, 8 : 172553 - 172562
  • [43] Sampled Bayesian Network Classifiers for Class-Imbalance and Cost-Sensitive Learning
    Jiang, Liangxiao
    Li, Chaoqun
    Cai, Zhihua
    Zhang, Harry
    2013 IEEE 25TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2013, : 512 - 517
  • [44] Ensemble learning via constraint projection and undersampling technique for class-imbalance problem
    Guo, Huaping
    Zhou, Jun
    Wu, Chang-An
    SOFT COMPUTING, 2020, 24 (07) : 4711 - 4727
  • [45] Using Gabriel graphs in Borderline-SMOTE to deal with severe two-class imbalance problems on neural networks
    Toribio, P.
    Alejo, R.
    Valdovinos, R. M.
    Pacheco-Sanchez, J. H.
    ARTIFICIAL INTELLIGENCE RESEARCH AND DEVELOPMENT, 2012, 248 : 29 - +
  • [46] Learning from class-imbalance and heterogeneous data for 30-day hospital readmission
    Du, Guodong
    Zhang, Jia
    Li, Shaozi
    Li, Candong
    NEUROCOMPUTING, 2021, 420 : 27 - 35
  • [47] APCNN: Tackling Class Imbalance in Relation Extraction through Aggregated Piecewise Convolutional Neural Networks
    Smirnova, Alisa
    Audiffren, Julien
    Cudre-Mauroux, Philippe
    2019 6TH SWISS CONFERENCE ON DATA SCIENCE (SDS), 2019, : 63 - 68
  • [48] Effects of Topological Factors and Class Imbalance on Node Classification Through Graph Convolutional Neural Networks
    Parlanti, Tatiana S.
    Catania, Carlos A.
    Moyano, Luis G.
    COMPUTER SCIENCE-CACIC 2023, 2024, 2123 : 213 - 226
  • [49] Identification of small open reading frames in plant lncRNA using class-imbalance learning
    Zhao, Siyuan
    Meng, Jun
    Wekesa, Jael Sanyanda
    Luan, Yushi
    COMPUTERS IN BIOLOGY AND MEDICINE, 2023, 157
  • [50] On Class-Incremental Learning for Fully Binarized Convolutional Neural Networks
    Basso-Bert, Yanis
    Guiequero, William
    Molnos, Anca
    Lemaire, Romain
    Dupret, Antoine
    2024 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS 2024, 2024,