ND-S: an oversampling algorithm based on natural neighbor and density peaks clustering

Authors
Guo, Ming [1]
Lu, Jia [1, 2]
Affiliations
[1] Chongqing Normal Univ, Coll Comp & Informat Sci, Chongqing 401331, Peoples R China
[2] Chongqing Digital Agr Serv Engn Technol Res Ctr, Chongqing 401331, Peoples R China
Source
JOURNAL OF SUPERCOMPUTING | 2023 / Vol. 79 / Issue 08
Keywords
Imbalanced learning; Oversampling; Natural neighbor; Density peaks clustering; SAMPLING METHOD; SMOTE;
DOI
10.1007/s11227-022-04965-8
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Imbalanced classification problems are common in the real world. Because the classes differ in size and the data distribution is complex, minority class samples are difficult to classify correctly. Oversampling techniques balance the data set by generating minority class samples; however, current clustering-based oversampling techniques are limited by hyperparameters, sample selection, and other issues that affect the final classification performance. In this paper, we propose an oversampling algorithm based on natural neighbor and density peaks clustering (ND-S). ND-S consists of three steps. First, the natural neighbor algorithm is used to find and filter out noise and outliers. Second, density peaks clustering, improved with a natural neighbor-based nonparametric adaptive scheme, clusters all samples and retains the clusters that meet the conditions. Finally, sampling weights are assigned to each cluster, and the minority class samples suitable for oversampling are selected by calculating the local sparsity of the samples and then oversampled with the synthetic minority oversampling technique (SMOTE). Experiments on 18 imbalanced data sets show that ND-S is effective for the imbalanced classification problem and that its classification performance is generally better than that of 8 other comparison algorithms.
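The first and last steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `natural_neighbors` and `smote_interpolate` are hypothetical, the stopping rule of the natural neighbor search is a simplified assumption (stop once no point lacks a reverse neighbor, or the number of such points stops changing), and the density peaks clustering, cluster weighting, and local sparsity steps are omitted entirely because the abstract does not specify them.

```python
import numpy as np

def natural_neighbors(X):
    """Simplified natural neighbor search (assumed stopping rule, see lead-in).

    Returns the neighborhood size k reached and, for each point, how many
    other points count it among their k nearest neighbors.
    """
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)               # a point is not its own neighbor
    order = np.argsort(dist, axis=1)[:, : n - 1]  # neighbors, nearest first
    reverse_count = np.zeros(n, dtype=int)
    prev_orphans = -1
    for k in range(1, n):
        # Each point's k-th nearest neighbor gains one reverse neighbor.
        np.add.at(reverse_count, order[:, k - 1], 1)
        orphans = int(np.sum(reverse_count == 0))
        if orphans == 0 or orphans == prev_orphans:
            return k, reverse_count
        prev_orphans = orphans
    return max(n - 1, 0), reverse_count

def smote_interpolate(X_min, n_new, k, rng):
    """Classic SMOTE step: interpolate between a minority sample and one of
    its k nearest minority neighbors. Returns an empty array if fewer than
    two minority samples are available."""
    m = len(X_min)
    if m < 2:
        return np.empty((0, X_min.shape[1]))
    k = max(1, min(k, m - 1))
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    nn = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, m, size=n_new)          # sample to start from
    neigh = nn[base, rng.integers(0, k, size=n_new)]  # neighbor to move toward
    lam = rng.random((n_new, 1))                   # interpolation factor in [0, 1)
    return X_min[base] + lam * (X_min[neigh] - X_min[base])

# Toy data (illustrative only): 40 majority and 8 minority samples.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(40, 2)),
               rng.normal(3.0, 0.5, size=(8, 2))])
y = np.array([0] * 40 + [1] * 8)

k, reverse_count = natural_neighbors(X)
keep = reverse_count > 0                  # treat points nobody "chooses" as noise
X_min_clean = X[keep & (y == 1)]
synthetic = smote_interpolate(X_min_clean, n_new=32, k=k, rng=rng)
print(k, X_min_clean.shape, synthetic.shape)
```

Because the neighborhood size `k` comes out of the search rather than being supplied by the user, the sketch reflects the abstract's point about avoiding hyperparameters, though only in this reduced form.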
Pages: 8668 - 8698
Page count: 31
Related papers
50 in total
  • [1] ND-S: an oversampling algorithm based on natural neighbor and density peaks clustering
    Ming Guo
    Jia Lu
    The Journal of Supercomputing, 2023, 79 : 8668 - 8698
  • [2] An improved density peaks clustering algorithm based on natural neighbor with a merging strategy
    Ding, Shifei
    Du, Wei
    Xu, Xiao
    Shi, Tianhao
    Wang, Yanru
    Li, Chao
    INFORMATION SCIENCES, 2023, 624 : 252 - 276
  • [3] Improved Density Peaks Clustering Based on Natural Neighbor Expanded Group
    Ding, Lin
    Xu, Weihong
    Chen, Yuantao
    COMPLEXITY, 2020, 2020 (2020)
  • [4] Density Peaks Clustering Algorithm Based on Shared Neighbor Degree and Probability Assignment
    Zhu, Hongxiang
    Wu, Genxiu
    Wang, Zhaohui
    Computer Engineering and Applications, 60 (12): : 74 - 90
  • [5] Hierarchical clustering algorithm based on natural local density peaks
    Cai, Fapeng
    Feng, Ji
    Yang, Degang
    Chen, Zhongshang
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (11) : 7989 - 8004
  • [6] A domain density peak clustering algorithm based on natural neighbor
    Chen, Di
    Du, Tao
    Zhou, Jin
    Shen, Tianyu
    INTELLIGENT DATA ANALYSIS, 2023, 27 (02) : 443 - 462
  • [7] Natural Neighbor-based Clustering Algorithm with Density Peeks
    Cheng, Dongdong
    Zhu, Qingsheng
    Huang, Jinlong
    Yang, Lijun
    2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016, : 92 - 98
  • [8] Density peaks clustering algorithm based on fuzzy and weighted shared neighbor for uneven density datasets
    Zhao, Jia
    Wang, Gang
    Pan, Jeng-Shyang
    Fan, Tanghuai
    Lee, Ivan
    PATTERN RECOGNITION, 2023, 139
  • [9] Density peaks clustering based on mutual neighbor degree
    Zhao J.
    Yao Z.-F.
    Lyu L.
    Fan T.-H.
    Kongzhi yu Juece/Control and Decision, 2021, 36 (03): : 543 - 552
  • [10] Natural Neighbor Density Extremum Clustering Algorithm
    Zhang, Zhonglin
    Zhao, Yu
    Yan, Guanghui
    Computer Engineering and Applications, 2024, 57 (23) : 200 - 201