Effective Class-Imbalance Learning Based on SMOTE and Convolutional Neural Networks

Cited by: 41
Authors
Joloudari, Javad Hassannataj [1]
Marefat, Abdolreza [2]
Nematollahi, Mohammad Ali [3]
Oyelere, Solomon Sunday [4]
Hussain, Sadiq [5]
Affiliations
[1] Univ Birjand, Fac Engn, Dept Comp Engn, Birjand 9717434765, Iran
[2] Islamic Azad Univ, Tech & Engn Fac, Dept Artificial Intelligence, South Tehran Branch, Tehran 1477893780, Iran
[3] Fasa Univ, Dept Comp Sci, Fasa 7461686131, Iran
[4] Lulea Univ Technol, Dept Comp Sci Elect & Space Engn, S-93187 Skelleftea, Sweden
[5] Dibrugarh Univ, Examinat Branch, Dibrugarh 786004, Assam, India
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 06
Keywords
imbalanced data; resampling; normalization; deep neural network; convolutional neural network; CORONARY-ARTERY-DISEASE; CLASSIFICATION; DIAGNOSIS; CLASSIFIERS; ALGORITHMS;
DOI
10.3390/app13064006
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Imbalanced Data (ID) is a problem that prevents Machine Learning (ML) models from achieving satisfactory results. ID occurs when the samples of one class outnumber those of the other by a wide margin, which biases the learning process of such models towards the majority class. In recent years, several solutions have been proposed to address this issue; they either synthetically generate new data for the minority class or reduce the number of majority-class samples to balance the data. In this paper, we investigate the effectiveness of methods based on Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs) combined with a variety of well-known imbalanced-data solutions, namely oversampling and undersampling. We then propose a CNN-based model in combination with the Synthetic Minority Oversampling Technique (SMOTE) to handle imbalanced data effectively. To evaluate our methods, we used the KEEL, breast cancer, and Z-Alizadeh Sani datasets. To obtain reliable results, we ran our experiments 100 times with randomly shuffled data distributions. The classification results demonstrate that the combined SMOTE-Normalization-CNN model outperforms the other methodologies, achieving 99.08% accuracy on the 24 imbalanced datasets. Therefore, the proposed mixed model can be applied to imbalanced binary classification problems on other real datasets.
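The oversampling and normalization steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy data, the neighbour count `k`, and the choice of min-max scaling are all assumptions made for illustration, and the CNN classifier is omitted. The sketch shows only the general SMOTE idea of interpolating between a minority sample and one of its nearest minority neighbours until the classes are balanced.

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def min_max_normalize(rows):
    """Rescale each feature column to [0, 1] (assumed normalization)."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Hypothetical toy data: 20 majority samples versus 4 minority samples.
majority = [[float(i), float(i % 5)] for i in range(20)]
minority = [[30.0, 1.0], [31.0, 2.0], [32.0, 1.5], [33.0, 2.5]]

new_points = smote_like_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
features = min_max_normalize(majority + balanced_minority)
```

In practice, a maintained implementation of SMOTE would normally be preferred over a hand-written one, and the resampling should be fitted on the training split only so that synthetic points do not leak into the evaluation data.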
Pages: 34