LDAMSS: Fast and efficient undersampling method for imbalanced learning

Cited by: 0
Authors
Ting Liang
Jie Xu
Bin Zou
Zhan Wang
Jingjing Zeng
Affiliations
[1] Hubei University, Faculty of Computer Science and Information Engineering
[2] Hubei University, Faculty of Mathematics and Statistics, Hubei Key Laboratory of Applied Mathematics
[3] Hubei University, Faculty of Mathematics and Statistics
Source
Applied Intelligence | 2022 / Vol. 52
Keywords
Linear discriminant analysis (LDA); Imbalanced learning; Markov selective sampling (MSS); Undersampling;
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
In this article, a novel undersampling method based on linear discriminant analysis (LDA) and Markov selective sampling (MSS) is proposed. The method has two stages. The first stage repeatedly adjusts the position of the classification boundary according to the G-mean of the LDA classifier. The second stage uses MSS to extract the "important" training samples from the current majority class. We apply the proposed undersampling method to XGBoost and study its learning performance. Experimental results on binary-class datasets show that, compared with other methods, XGBoost based on LDAMSS (X-LDAMSS) not only performs better on three metrics (F-measure, G-mean, and AUC) but also requires less total time. We also apply X-LDAMSS to multi-class classification problems and present some useful discussions.
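The abstract does not spell out how the G-mean drives the boundary adjustment, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It uses the standard definition of the G-mean (the geometric mean of the true-positive rate and the true-negative rate) and picks, from a few candidate positions, the decision threshold on one-dimensional classifier scores that maximizes it. The function names `g_mean` and `best_threshold`, the candidate thresholds, and the toy scores are all assumptions made for illustration.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of the true-positive rate and the true-negative rate."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    tnr = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(tpr * tnr)


def best_threshold(scores, y_true, candidates):
    """Hypothetical helper: return the candidate threshold with the highest G-mean."""
    best_t, best_g = None, -1.0
    for t in candidates:
        # Samples scoring at or above the threshold are predicted as minority (1).
        y_pred = [1 if s >= t else 0 for s in scores]
        g = g_mean(y_true, y_pred)
        if g > best_g:
            best_t, best_g = t, g
    return best_t, best_g


# Toy imbalanced data (illustrative only): 8 majority (0) and 2 minority (1) samples.
scores = [0.1, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.55, 0.5, 0.7]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
threshold, score = best_threshold(scores, labels, [0.3, 0.5, 0.6])
print(threshold, round(score, 4))
```

Because the G-mean multiplies the two per-class rates, a boundary that sacrifices the minority class scores poorly even when overall accuracy is high, which is why it is a common selection criterion in imbalanced learning.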
Pages: 6794-6811
Number of pages: 17