Stratification-based semi-supervised clustering algorithm for arbitrary shaped datasets

Cited by: 4
Authors
Wang, Fei [2]
Li, Le [1]
Liu, Zhiqiang [3]
Affiliations
[1] Inner Mongolia Elect Informat Vocat Tech Coll, Hohhot 010000, Peoples R China
[2] Inner Mongolia Peoples Congress, Hohhot 010000, Peoples R China
[3] Inner Mongolia Univ Technol, Hohhot 010000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Semi-supervised clustering; Kmeans; Seeded-Kmeans; Partitional clustering; Influence space; K-MEANS; SEARCH;
DOI
10.1016/j.ins.2023.119004
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Semi-supervised clustering is both an important branch of semi-supervised learning and a direction for improving clustering. Semi-supervised clustering algorithms built on Kmeans, such as the classical Seeded-Kmeans and Constrained-Kmeans, use supervision information to guide the clustering iterations, but they share the disadvantage of the original Kmeans algorithm: they assume isotropic spherical clusters, which limits their adaptability to data with varied characteristics. To solve this problem, we propose the scattered centroids initialization clustering algorithm based on stratification (SCICS). First, based on the concept of influence space, we present a method for modeling the cluster-level location of any object, from which well-defined cluster decision boundaries can be obtained through stratification. On this basis, by extending the seeding idea, we propose a semi-supervised subclustering algorithm that overcomes the limitations of partitional clustering methods that rely on strict assumptions about particular cluster distributions. Experiments on artificial and real-world datasets show that the proposed algorithm can cluster arbitrarily shaped data and surpasses its competitors in performance and adaptability.
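For orientation, the Seeded-Kmeans baseline named in the abstract can be sketched as follows. This is not the SCICS algorithm, whose details the abstract does not give. It is a minimal illustration of the seeding idea as it is commonly described: each centroid starts at the mean of the labeled seed points for its class, after which ordinary Kmeans iterations run. The function name `seeded_kmeans`, its parameters, and the toy data are assumptions made for this example.

```python
import numpy as np

def seeded_kmeans(X, seed_idx, seed_labels, n_iter=100):
    """Hypothetical sketch of Seeded-Kmeans: seeds only set the initial centroids."""
    X = np.asarray(X, dtype=float)
    seed_idx = np.asarray(seed_idx)
    seed_labels = np.asarray(seed_labels)
    classes = np.unique(seed_labels)
    # Initial centroid of each class = mean of its labeled seed points.
    centroids = np.stack(
        [X[seed_idx[seed_labels == c]].mean(axis=0) for c in classes]
    )
    assign = None
    for _ in range(n_iter):
        # Assign every point to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_assign = dists.argmin(axis=1)
        if assign is not None and np.array_equal(new_assign, assign):
            break  # assignments stopped changing
        assign = new_assign
        # Move each centroid to the mean of its current members.
        for k in range(len(classes)):
            members = X[assign == k]
            if len(members):
                centroids[k] = members.mean(axis=0)
    return classes[assign], centroids

# Toy data (assumed): two compact, well-separated groups, one seed per group.
X_demo = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
labels_demo, centroids_demo = seeded_kmeans(X_demo, seed_idx=[0, 3], seed_labels=[0, 1])
print(labels_demo)
```

Because every point is assigned to its nearest centroid, each cluster region is convex. This is the spherical-cluster limitation that the abstract attributes to Kmeans-based methods.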
Pages: 18