ND-S: an oversampling algorithm based on natural neighbor and density peaks clustering

Cited by: 0
Authors
Guo, Ming [1]
Lu, Jia [1, 2]
Affiliations
[1] Chongqing Normal Univ, Coll Comp & Informat Sci, Chongqing 401331, Peoples R China
[2] Chongqing Digital Agr Serv Engn Technol Res Ctr, Chongqing 401331, Peoples R China
Keywords
Imbalanced learning; Oversampling; Natural neighbor; Density peaks clustering; Sampling method; SMOTE
DOI
10.1007/s11227-022-04965-8
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Imbalanced classification problems are common in the real world. Because of the imbalance in the amount of data and the complex nature of the distribution, minority class samples are difficult to classify correctly. Oversampling techniques balance the data set by generating minority class samples; however, current clustering-based oversampling techniques are limited by hyperparameters, sample selection, and other issues that affect the final classification performance. In this paper, we propose an oversampling algorithm based on natural neighbor and density peaks clustering (ND-S). ND-S consists of three steps. First, the natural neighbor algorithm is used to find and filter out noise and outliers. Second, density peaks clustering is improved with a natural neighbor-based, nonparametric adaptive approach; it clusters all samples and retains the clusters that meet the conditions. Finally, sampling weights are assigned to each cluster, and the minority class samples suitable for oversampling are selected by calculating the local sparsity of the samples and then oversampled with the synthetic minority oversampling technique (SMOTE). Experiments on 18 imbalanced data sets show that ND-S is effective for the imbalanced classification problem and that its classification performance is generally better than that of 8 other comparison algorithms.
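The first and last steps of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact rules, so the code assumes the commonly described natural neighbor search (grow the neighborhood size r until the number of points that nobody has picked as a neighbor stops changing) and plain SMOTE interpolation between minority samples that are natural neighbors of each other. The density peaks clustering step, the cluster sampling weights, and the local sparsity criterion are omitted entirely. The function names `natural_neighbors` and `oversample_minority` are hypothetical.

```python
import math
import random

def natural_neighbors(points):
    """Simplified natural neighbor search. Returns (neighbor sets, final r)."""
    n = len(points)
    # For every point, the indices of all other points sorted by distance.
    order = [
        sorted((j for j in range(n) if j != i),
               key=lambda j, i=i: math.dist(points[i], points[j]))
        for i in range(n)
    ]
    reverse_count = [0] * n  # how often each point was picked as a neighbor
    prev_orphans = None
    r = 0
    for r in range(1, n):
        for i in range(n):
            reverse_count[order[i][r - 1]] += 1  # i picks its r-th neighbor
        orphans = reverse_count.count(0)
        # Stop once every point has been picked, or the orphan count is stable.
        if orphans == 0 or orphans == prev_orphans:
            break
        prev_orphans = orphans
    knn = [set(order[i][:r]) for i in range(n)]
    # i and j are natural neighbors if each lies in the other's r-neighborhood.
    return [{j for j in knn[i] if i in knn[j]} for i in range(n)], r

def oversample_minority(points, labels, minority_label, n_new, seed=0):
    """SMOTE-style interpolation restricted to minority natural neighbor pairs."""
    rng = random.Random(seed)
    neighbors, _ = natural_neighbors(points)
    # Assumption: a minority sample with no minority natural neighbor is
    # treated as noise or an outlier and never used as a seed.
    pairs = [(i, j)
             for i, lab in enumerate(labels) if lab == minority_label
             for j in neighbors[i] if labels[j] == minority_label]
    synthetic = []
    for _ in range(n_new if pairs else 0):
        i, j = rng.choice(pairs)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a)
                               for a, b in zip(points[i], points[j])))
    return synthetic

# Toy data (made up for illustration): 4 minority and 8 majority samples.
toy_points = [(0.0, 0.0), (0.1, 0.0), (0.3, 0.2), (0.2, 0.5),
              (5.0, 5.0), (5.5, 5.0), (5.0, 5.6), (6.0, 6.0),
              (5.7, 6.4), (6.5, 5.2), (4.6, 5.9), (6.2, 4.7)]
toy_labels = [1] * 4 + [0] * 8
toy_new = oversample_minority(toy_points, toy_labels, minority_label=1, n_new=4)
```

Because every synthetic sample is a convex combination of two minority samples that are mutual neighbors, new points stay inside the region already occupied by the minority class. That is the motivation for filtering noise before interpolating, although ND-S as described adds cluster-level weighting on top of this.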
Pages: 8668-8698
Number of pages: 31