Classifying imbalanced data using ensemble of reduced kernelized weighted extreme learning machine

Cited by: 0
Authors
Bhagat Singh Raghuwanshi
Sanyam Shukla
Affiliation
[1] Maulana Azad National Institute of Technology,
Source
International Journal of Machine Learning and Cybernetics | 2019 / Vol. 10
Keywords
Reduced kernelized extreme learning machine; Imbalance problem; Random undersampling; Classification;
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Many real-world applications are imbalanced classification problems, where the number of samples in one class is significantly smaller than the number of samples in another class. The classes with the larger and smaller proportions of samples are called the majority and minority class, respectively. Weighted extreme learning machine (WELM) was designed to handle the class imbalance problem. Several works, such as boosting WELM (BWELM) and ensemble WELM, extended WELM by using ensemble methods. All these variants use WELM with sigmoid nodes to handle the class imbalance problem effectively. WELM with sigmoid nodes suffers from performance fluctuation due to the random initialization of the weights between the input and the hidden layer. Hybrid artificial bee colony optimization-based WELM extends WELM by finding the optimal weights between the input and the hidden layer using the artificial bee colony optimization algorithm. The computational cost of the kernelized ELM is directly proportional to the number of kernel functions. This work therefore proposes a novel ensemble that uses reduced kernelized WELM as the base classifier to solve the class imbalance problem more effectively. The proposed work uses random undersampling to design balanced training subsets, which act as the centroids of the reduced kernelized WELM classifier. The proposed ensemble generates the base classifiers sequentially. The majority class samples misclassified by the first base classifier, along with all of the minority class samples, act as the centroids of the second base classifier; that is, the base classifiers differ in the choice of centroids for the kernel functions. The proposed work also has a lower computational cost than BWELM. The proposed method is evaluated on benchmark real-world imbalanced datasets taken from the KEEL dataset repository. It was also tested on binary synthetic datasets in order to analyze its robustness. The experimental results demonstrate the superiority of the proposed algorithm over other state-of-the-art methods for class imbalance learning. This is also confirmed by the statistical tests conducted.
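The sequential construction described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. Several details are assumptions made only for illustration, because the abstract does not specify them: the Gaussian (RBF) kernel and its width `sigma`, the per-sample weight of one over the class size, the regularization constant `reg`, the stopping rule when no majority sample is misclassified, and the combination of base classifiers by averaging their scores. All function names are hypothetical. Labels are assumed to be +1 for the minority class and -1 for the majority class.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    # Pairwise Gaussian kernel between the rows of A and the rows of B.
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def undersample_centroids(y, rng):
    # All minority samples plus an equally sized random draw of majority samples.
    min_idx = np.flatnonzero(y == 1)
    maj_idx = np.flatnonzero(y == -1)
    k = min(len(min_idx), len(maj_idx))
    return np.concatenate([min_idx, rng.choice(maj_idx, size=k, replace=False)])

def fit_reduced_kwelm(X, y, centroid_idx, reg=1.0, sigma=1.0):
    # Reduced kernel matrix: all N training samples against m chosen centroids.
    C = X[centroid_idx]
    K = rbf_kernel(X, C, sigma)                      # shape (N, m)
    # Assumed weighting: each sample is weighted by 1 / (size of its class).
    w = np.where(y == 1, 1.0 / np.sum(y == 1), 1.0 / np.sum(y == -1))
    KtW = K.T * w                                    # K^T W, shape (m, N)
    A = np.eye(len(centroid_idx)) / reg + KtW @ K    # regularized normal equations
    beta = np.linalg.solve(A, KtW @ y.astype(float))
    return C, beta

def decision(model, X, sigma=1.0):
    C, beta = model
    return rbf_kernel(X, C, sigma) @ beta

def fit_sequential_ensemble(X, y, n_models=2, reg=1.0, sigma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    min_idx = np.flatnonzero(y == 1)
    maj_idx = np.flatnonzero(y == -1)
    centroid_idx = undersample_centroids(y, rng)     # first base classifier
    models = []
    for _ in range(n_models):
        model = fit_reduced_kwelm(X, y, centroid_idx, reg, sigma)
        models.append(model)
        # Majority samples that the latest base classifier labels as minority.
        wrong = maj_idx[decision(model, X[maj_idx], sigma) >= 0]
        if len(wrong) == 0:                          # assumed stopping rule
            break
        # Next centroids: all minority samples plus the misclassified majority ones.
        centroid_idx = np.concatenate([min_idx, wrong])
    return models

def predict_ensemble(models, X, sigma=1.0):
    # Assumed combination rule: average the base classifiers' scores.
    avg = np.mean([decision(m, X, sigma) for m in models], axis=0)
    return np.where(avg >= 0, 1.0, -1.0)
```

Because each base classifier solves an m-by-m linear system, where m is the number of centroids, rather than an N-by-N one, the sketch also shows why a reduced kernel lowers the training cost when the minority class is small.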
Pages: 3071-3097
Page count: 26