Classifying imbalanced data using BalanceCascade-based kernelized extreme learning machine

Cited by: 0
Authors
Bhagat Singh Raghuwanshi
Sanyam Shukla
Affiliation
[1] Maulana Azad National Institute of Technology,
Source
Pattern Analysis and Applications | 2020 / Vol. 23
Keywords
Imbalanced learning; Classification; Kernelized extreme learning machine; BalanceCascade ensemble; Voting methods;
DOI
Not available
Abstract
Imbalanced learning is one of the most challenging problems in data mining. Datasets with a skewed class distribution hinder conventional learning methods, which give the same importance to all examples and therefore produce predictions biased toward the majority classes. Numerous strategies have been proposed to address this intrinsic deficiency, such as the weighted extreme learning machine (WELM) and boosting WELM (BWELM). This work designs a novel BalanceCascade-based kernelized extreme learning machine (BCKELM) to tackle the class imbalance problem more effectively. BalanceCascade combines the merits of random undersampling and ensemble methods. The proposed method uses random undersampling to construct balanced training subsets, and the ensemble generates its base learners sequentially. In each iteration, the correctly classified majority-class examples are replaced by other majority-class examples to create a new balanced training subset, so the base learners differ in the balanced training subset on which they are trained. The cardinality of the balanced training subsets depends on the imbalance ratio. A kernelized extreme learning machine (KELM) serves as the base learner because it is stable and has good generalization performance. The time complexity of BCKELM is considerably lower than that of BWELM, BalanceCascade, EasyEnsemble and hybrid artificial bee colony WELM. An exhaustive experimental evaluation on real-world benchmark datasets demonstrates the efficacy of the proposed method.
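The sequential procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `rbf_kernel`, `KELM`, `fit_cascade` and `predict_vote`, the RBF kernel, the parameters `C` and `gamma`, the label convention (+1 minority, -1 majority), the number of base learners (the rounded-up imbalance ratio) and the plain majority vote are all assumptions made for illustration. The base learner uses the standard closed-form KELM solution, beta = (K + I/C)^-1 y.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise squared Euclidean distances between rows of A and rows of B.
    d2 = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

class KELM:
    """Kernelized ELM for labels in {-1, +1} (illustrative sketch)."""
    def __init__(self, C=10.0, gamma=1.0):
        self.C, self.gamma = C, gamma

    def fit(self, X, y):
        self.X = X
        K = rbf_kernel(X, X, self.gamma)
        # Closed-form output weights: beta = (K + I/C)^-1 y
        self.beta = np.linalg.solve(K + np.eye(len(X)) / self.C, y.astype(float))
        return self

    def predict(self, X):
        scores = rbf_kernel(X, self.X, self.gamma) @ self.beta
        return np.where(scores >= 0, 1, -1)

def fit_cascade(X, y, C=10.0, gamma=1.0, seed=0):
    # Assumption: +1 marks the minority class, -1 the (larger) majority class.
    rng = np.random.default_rng(seed)
    min_idx = np.flatnonzero(y == 1)
    pool = np.flatnonzero(y == -1)
    n_min = len(min_idx)
    # Assumption: one base learner per unit of imbalance ratio, rounded up.
    n_learners = int(np.ceil(len(pool) / n_min))
    subset = rng.choice(pool, size=n_min, replace=False)
    learners = []
    for _ in range(n_learners):
        idx = np.concatenate([min_idx, subset])  # balanced training subset
        model = KELM(C, gamma).fit(X[idx], y[idx])
        learners.append(model)
        correct = model.predict(X[subset]) == -1
        kept = subset[~correct]                      # misclassified majority stays
        pool = np.setdiff1d(pool, subset[correct])   # correctly classified is dropped
        fresh = np.setdiff1d(pool, kept)
        n_needed = n_min - len(kept)
        if len(fresh) < n_needed:
            break  # not enough unseen majority examples to rebalance
        subset = np.concatenate([kept, rng.choice(fresh, size=n_needed, replace=False)])
    return learners

def predict_vote(learners, X):
    # Plain majority vote; ties are resolved in favor of the minority class.
    votes = np.sum([m.predict(X) for m in learners], axis=0)
    return np.where(votes >= 0, 1, -1)
```

Calling `fit_cascade(X, y)` on a NumPy feature matrix and a ±1 label vector returns the list of base learners, and `predict_vote(learners, X_new)` combines their outputs. Each KELM is trained only on a balanced subset, so the linear system solved per learner has roughly twice the minority-class size, which is one plausible reading of why such an ensemble stays cheap.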
Pages: 1157-1182
Page count: 25