A Hybrid MultiLayer Perceptron Under-Sampling with Bagging Dealing with a Real-Life Imbalanced Rice Dataset

Cited by: 1
Authors
Diallo, Moussa [1, 2]
Xiong, Shengwu [1]
Emiru, Eshete Derb [1, 3]
Fesseha, Awet [1, 4]
Abdulsalami, Aminu Onimisi [1]
Abd Elaziz, Mohamed [5]
Affiliations
[1] Wuhan Univ Technol, Sch Comp Sci & Technol, Wuhan 430070, Peoples R China
[2] Ecole Normale Super, Dept Math & Comp Sci, Bamako 241, Mali
[3] Debre Markos Univ, Sch Comp, Debremarkos 269, Ethiopia
[4] Mekele Univ, Sch Nat Sci & Comp, Mekelle 231, Ethiopia
[5] Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, Wuhan 430074, Peoples R China
Keywords
classification algorithms; imbalanced dataset; climate change; rice dataset; Office du Niger; DATA-SETS; SMOTE; CLASSIFICATION; PERFORMANCE; ENSEMBLES
DOI
10.3390/info12080291
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Classification algorithms have shown exceptional prediction results in supervised learning. However, they are not always effective on real-life datasets, which are generally imbalanced in their class distributions. Several methods have been proposed to address class imbalance. In this paper, we propose a hybrid method that combines preprocessing techniques with ensemble learning. The original training set is undersampled by evaluating the samples with a stochastic measurement (SM) and then training a Multilayer Perceptron on the selected samples to return a balanced training set. The balanced training set produced by MLPUS (Multilayer Perceptron Under-Sampling) is then aggregated using the bagging ensemble method. We applied our method to the real-life Niger_Rice dataset and to forty-four other imbalanced datasets from the KEEL repository. We also compared it with six existing methods from the literature: the MLP classifier on the original imbalanced dataset, MLPUS, UnderBagging (random under-sampling combined with bagging), RUSBoost, SMOTEBagging (Synthetic Minority Oversampling Technique combined with bagging), and SMOTEBoost. The results show that our method is competitive with the other methods. On the real-life Niger_Rice dataset, the proposed method achieves 75.6, 0.73, 0.76, and 0.86 for accuracy, F-measure, G-mean, and ROC, respectively. In contrast, the MLP classifier on the original imbalanced Niger_Rice dataset gives 72.44, 0.82, 0.59, and 0.76 for the same metrics.
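As a rough illustration of two ideas the abstract relies on, the sketch below computes G-mean and F-measure from predicted labels and builds class-balanced bags by under-sampling the majority class. It is not the authors' MLPUS procedure: plain random under-sampling stands in for the SM-based sample selection, no classifier is trained, and the function names, the choice of label 1 as the minority (positive) class, and the toy labels are all assumptions made for this example.

```python
import math
import random


def g_mean_and_f_measure(y_true, y_pred, positive=1):
    """Return (G-mean, F-measure), treating `positive` as the minority class (assumption)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    # G-mean is the geometric mean of the per-class recalls.
    return math.sqrt(recall * specificity), f_measure


def balanced_bags(labels, n_bags, minority=1, seed=0):
    """Build index sets with equal class counts by randomly under-sampling
    the majority class (a stand-in for MLPUS; assumes the minority is smaller)."""
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    bags = []
    for _ in range(n_bags):
        sampled = rng.sample(majority_idx, len(minority_idx))
        bags.append(sorted(minority_idx + sampled))
    return bags


# Toy data: 2 minority samples, 8 majority samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_majority = [0] * 10  # 80% accurate, yet it never finds the minority class
g, f = g_mean_and_f_measure(y_true, always_majority)
bags = balanced_bags(y_true, n_bags=3)
```

In a bagging ensemble, one base classifier would be trained on each bag and the predictions combined by voting. The toy predictor shows why the abstract reports G-mean alongside accuracy: a model that always predicts the majority class scores high accuracy while its G-mean is zero.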
Pages: 21