Diversified Sensitivity-Based Undersampling for Imbalance Classification Problems

Cited by: 175
Authors
Ng, Wing W. Y. [1]
Hu, Junjie [2]
Yeung, Daniel S. [1]
Yin, Shaohua [1]
Roli, Fabio [3]
Affiliations
[1] S China Univ Technol, Sch Comp Sci & Engn, Guangzhou 510006, Guangdong, Peoples R China
[2] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Hong Kong, Peoples R China
[3] Univ Cagliari, Dept Elect & Elect Engn, I-09123 Cagliari, Italy
Funding
National Natural Science Foundation of China
Keywords
Diversified sensitivity undersampling (DSUS); imbalance data; sample selection; SMOTE
DOI
10.1109/TCYB.2014.2372060
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Undersampling is a widely adopted method to deal with imbalance pattern classification problems. Current methods mainly depend on either random resampling on the majority class or resampling at the decision boundary. Random-based undersampling fails to take into consideration informative samples in the data, while resampling at the decision boundary is sensitive to class overlapping. Both techniques ignore the distribution information of the training dataset. In this paper, we propose a diversified sensitivity-based undersampling method. Samples of the majority class are clustered to capture the distribution information and enhance the diversity of the resampling. A stochastic sensitivity measure is applied to select samples from both clusters of the majority class and the minority class. By iteratively clustering and sampling, a balanced set of samples yielding high classifier sensitivity is selected. The proposed method yields a good generalization capability for 14 UCI datasets.
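The abstract describes clustering the majority class and then picking samples by a sensitivity measure. The following is a rough, single-round sketch of that idea in plain Python, not the paper's implementation. Everything concrete here is an assumption made for illustration: the function names (`kmeans`, `make_scorer`, `sensitivity`, `undersample`), the stand-in classifier (a sigmoid over distances to the two class means), the Monte Carlo perturbation estimate used as the sensitivity, and the choice of one cluster per minority sample. The record does not give the paper's actual sensitivity formula, classifier, or iteration scheme, and this sketch only selects from the majority class.

```python
import math
import random


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def kmeans(points, k, iters=20, rng=None):
    """Plain Lloyd's k-means; returns non-empty clusters as lists of indices."""
    rng = rng or random.Random(0)
    centers = [points[i] for i in rng.sample(range(len(points)), k)]
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for idx, p in enumerate(points):
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[nearest].append(idx)
        for c, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                centers[c] = mean([points[i] for i in members])
    return [members for members in clusters if members]


def make_scorer(majority, minority):
    """Hypothetical stand-in classifier: sigmoid of the distance gap to the class means."""
    mu_maj, mu_min = mean(majority), mean(minority)

    def score(x):
        z = math.dist(x, mu_maj) - math.dist(x, mu_min)
        return 1.0 / (1.0 + math.exp(-z))

    return score


def sensitivity(score, x, q=0.1, draws=30, rng=None):
    """Monte Carlo estimate of E[(score(x + dx) - score(x))^2], dx uniform in [-q, q]^d."""
    rng = rng or random.Random(0)
    base = score(x)
    total = 0.0
    for _ in range(draws):
        perturbed = tuple(v + rng.uniform(-q, q) for v in x)
        total += (score(perturbed) - base) ** 2
    return total / draws


def undersample(majority, minority, rng=None):
    """One round: cluster the majority class, keep the most sensitive sample per cluster."""
    rng = rng or random.Random(0)
    score = make_scorer(majority, minority)
    clusters = kmeans(majority, k=len(minority), rng=rng)
    chosen = [
        max(members, key=lambda i: sensitivity(score, majority[i], rng=rng))
        for members in clusters
    ]
    return [majority[i] for i in chosen]


if __name__ == "__main__":
    gen = random.Random(1)
    majority = [(gen.gauss(0, 1), gen.gauss(0, 1)) for _ in range(40)]
    minority = [(gen.gauss(3, 0.5), gen.gauss(3, 0.5)) for _ in range(5)]
    picked = undersample(majority, minority)
    print(len(picked), "majority samples kept for", len(minority), "minority samples")
```

Clustering first is what supplies the diversity: each retained majority sample comes from a different region of the majority distribution, and the sensitivity ranking then decides which sample represents that region. An iterative variant in the spirit of the abstract would retrain the classifier on the selected set and repeat the clustering and selection.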
Pages: 2402-2412
Number of pages: 11
Related papers
50 in total
  • [1] WEIGHTED ENSEMBLE OF DIVERSIFIED SENSITIVITY-BASED UNDERSAMPLING FOR IMBALANCED PATTERN CLASSIFICATION PROBLEMS
    Chai, Yulin
    Zhang, Jianjun
    Ng, Wing W. Y.
    PROCEEDINGS OF 2017 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), VOL 1, 2017, : 42 - 47
  • [2] An Undersampling Method Approaching the Ideal Classification Boundary for Imbalance Problems
    Zhou, Wensheng
    Liu, Chen
    Yuan, Peng
    Jiang, Lei
    APPLIED SCIENCES-BASEL, 2024, 14 (13):
  • [3] Ddco - diversified data characteristic-based oversampling for imbalance classification problems
    REKHA G.
    REDDY V.K.
    Journal of Information Science and Engineering, 2021, 37 (05) : 1011 - 1023
  • [4] LOAN DEFAULT PREDICTION USING DIVERSIFIED SENSITIVITY UNDERSAMPLING
    Chen, Ya-Qi
    Zhang, Jianjun
    Ng, Wing W. Y.
    PROCEEDINGS OF 2018 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), VOL 1, 2018, : 240 - 245
  • [5] Undersampling of approaching the classification boundary for imbalance problem
    Jiang, Lei
    Yuan, Peng
    Liao, Jing
    Zhang, Qiongbing
    Liu, Jianxun
    Li, Keqin
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2023, 35 (06): : 1
  • [6] UNDERSAMPLING NEAR DECISION BOUNDARY FOR IMBALANCE PROBLEMS
    Zhang, Jianjun
    Wang, Ting
    Ng, Wing W. Y.
    Zhang, Shuai
    Nugent, Chris D.
    PROCEEDINGS OF 2019 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), 2019, : 553 - 560
  • [7] SOUL: Scala Oversampling and Undersampling Library for imbalance classification
    Rodriguez, Nestor
    Lopez, David
    Fernandez, Alberto
    Garcia, Salvador
    Herrera, Francisco
    SOFTWAREX, 2021, 15
  • [8] LOCATION BAGGING-BASED UNDERSAMPLING FOR IMBALANCED CLASSIFICATION PROBLEMS
    Rong, Tongwen
    Tian, Xing
    Ng, Wing W. Y.
    PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON WAVELET ANALYSIS AND PATTERN RECOGNITION (ICWAPR), 2016, : 72 - 77
  • [9] Anomaly detection-based undersampling for imbalanced classification problems
    Park, You-Jin
    Brito, Paula
    Ma, Yun-Chen
    ENGINEERING OPTIMIZATION, 2024, 56 (12) : 2565 - 2578
  • [10] Class-overlap undersampling based on Schur decomposition for Class-imbalance problems
    Dai, Qi
    Liu, Jian-wei
    Shi, Yong-hui
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 221