Effective Class-Imbalance Learning Based on SMOTE and Convolutional Neural Networks

Cited by: 41
Authors
Joloudari, Javad Hassannataj [1]
Marefat, Abdolreza [2]
Nematollahi, Mohammad Ali [3]
Oyelere, Solomon Sunday [4]
Hussain, Sadiq [5]
Affiliations
[1] Univ Birjand, Fac Engn, Dept Comp Engn, Birjand 9717434765, Iran
[2] Islamic Azad Univ, Tech & Engn Fac, Dept Artificial Intelligence, South Tehran Branch, Tehran 1477893780, Iran
[3] Fasa Univ, Dept Comp Sci, Fasa 7461686131, Iran
[4] Lulea Univ Technol, Dept Comp Sci Elect & Space Engn, S-93187 Skelleftea, Sweden
[5] Dibrugarh Univ, Examinat Branch, Dibrugarh 786004, Assam, India
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 06
Keywords
imbalanced data; resampling; normalization; deep neural network; convolutional neural network; CORONARY-ARTERY-DISEASE; CLASSIFICATION; DIAGNOSIS; CLASSIFIERS; ALGORITHMS
DOI
10.3390/app13064006
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Imbalanced Data (ID) is a problem that prevents Machine Learning (ML) models from achieving satisfactory results. ID arises when the samples of one class outnumber those of the other by a wide margin, which biases a model's learning process towards the majority class. In recent years, several solutions have been proposed to address this issue; they either synthetically generate new data for the minority class or reduce the number of majority-class samples to balance the data. In this paper, we investigate the effectiveness of methods based on Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs) combined with a variety of well-known imbalanced-data solutions, namely oversampling and undersampling. We then propose a CNN-based model in combination with the Synthetic Minority Oversampling Technique (SMOTE) to handle imbalanced data effectively. To evaluate our methods, we used the KEEL, breast cancer, and Z-Alizadeh Sani datasets. To obtain reliable results, we ran our experiments 100 times with randomly shuffled data distributions. The classification results demonstrate that the combined SMOTE-Normalization-CNN model outperforms the other methodologies, achieving 99.08% accuracy on the 24 imbalanced datasets. The proposed model can therefore be applied to imbalanced binary classification problems on other real datasets.
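The oversampling and normalization steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the abstract gives no parameters, so the function names (`smote`, `balance`, `min_max_normalize`), the neighbour count `k=5`, the choice of min-max scaling, and the toy data are all assumptions made for illustration. The CNN classifier itself is omitted.

```python
import math
import random


def min_max_normalize(rows):
    """Scale each feature column to [0, 1]; constant columns map to 0.0.
    (Assumption: the abstract does not say which normalization is used.)"""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples, each interpolated between a random
    minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: math.dist(base, m))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label=1, k=5, seed=0):
    """Oversample the minority class until both classes have equal size."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(X) - len(minority)
    new = smote(minority, n_majority - len(minority), k=k, seed=seed)
    return X + new, y + [minority_label] * len(new)


# Toy binary dataset (hypothetical): four majority samples, two minority.
X = [[1.0, 10.0], [1.2, 11.0], [0.9, 9.5], [1.1, 10.5], [5.0, 50.0], [5.5, 52.0]]
y = [0, 0, 0, 0, 1, 1]

X_bal, y_bal = balance(X, y)        # SMOTE step
X_norm = min_max_normalize(X_bal)   # normalization step
```

In this sketch the data are oversampled first and normalized afterwards, following the order suggested by the name SMOTE-Normalization-CNN; in a real experiment the resampling would typically be applied only to the training split so that synthetic samples do not leak into the evaluation data.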
Pages: 34
Related papers
50 in total
  • [31] Compensating class imbalance for acoustic chimpanzee detection with convolutional recurrent neural networks
    Anders, Franz
    Kalan, Ammie K.
    Kuehl, Hjalmar S.
    Fuchs, Mirco
    ECOLOGICAL INFORMATICS, 2021, 65
  • [32] Weighted Extreme Learning Machine for Dengue Detection with Class-imbalance Classification
    Nadda, Wanchaloem
    Boonchieng, Waraporn
    Boonchieng, Ekkarat
    2019 IEEE HEALTHCARE INNOVATIONS AND POINT OF CARE TECHNOLOGIES (HI-POCT), 2019 : 151 - 154
  • [33] DOS-GAN: A Distributed Over-Sampling Method Based on Generative Adversarial Networks for Distributed Class-Imbalance Learning
    Guan, Hongtao
    Ma, Xingkong
    Shen, Siqi
    ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2020, PT III, 2020, 12454 : 609 - 622
  • [34] Convolutional Neural Network Based Sleep Stage Classification with Class Imbalance
    Xu, Qi
    Zhou, Dongdong
    Wang, Jian
    Shen, Jiangrong
    Kettunen, Lauri
    Cong, Fengyu
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
  • [35] Class-overlap undersampling based on Schur decomposition for Class-imbalance problems
    Dai, Qi
    Liu, Jian-wei
    Shi, Yong-hui
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 221
  • [36] Transfer synthetic over-sampling for class-imbalance learning with limited minority class data
    Xu-Ying Liu
    Sheng-Tao Wang
    Min-Ling Zhang
    Frontiers of Computer Science, 2019, 13 : 996 - 1009
  • [37] Ensemble learning via constraint projection and undersampling technique for class-imbalance problem
    Huaping Guo
    Jun Zhou
    Chang-an Wu
    Soft Computing, 2020, 24 : 4711 - 4727
  • [38] Transfer synthetic over-sampling for class-imbalance learning with limited minority class data
    Liu, Xu-Ying
    Wang, Sheng-Tao
    Zhang, Min-Ling
    FRONTIERS OF COMPUTER SCIENCE, 2019, 13 (05) : 996 - 1009
  • [39] Effective Feature Selection Method for Class-Imbalance Datasets Applied to Chemical Toxicity Prediction
    Antelo-Collado, Aurelio
    Carrasco-Velar, Ramon
    Garcia-Pedrajas, Nicolas
    Cerruela-Garcia, Gonzalo
    JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2021, 61 (01) : 76 - 94
  • [40] Documenting Evidence of a Reuse of 'A Systematic Study of the Class Imbalance Problem in Convolutional Neural Networks'
    Yedida, Rahul
    Menzies, Tim
    PROCEEDINGS OF THE 29TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE '21), 2021, : 1595 - 1595