Stratification-based semi-supervised clustering algorithm for arbitrary shaped datasets

Times cited: 4
Authors
Wang, Fei [2]
Li, Le [1]
Liu, Zhiqiang [3]
Affiliations
[1] Inner Mongolia Elect Informat Vocat Tech Coll, Hohhot 010000, Peoples R China
[2] Inner Mongolia Peoples Congress, Hohhot 010000, Peoples R China
[3] Inner Mongolia Univ Technol, Hohhot 010000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Semi-supervised clustering; Kmeans; Seeded-Kmeans; Partitional clustering; Influence space; K-MEANS; SEARCH;
DOI
10.1016/j.ins.2023.119004
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Semi-supervised clustering is both an important branch of semi-supervised learning and a direction for improving clustering. Semi-supervised clustering algorithms built on Kmeans, such as the classical Seeded-Kmeans and Constrained-Kmeans, use supervision information to guide the clustering iterations, but they share the disadvantages of the original Kmeans algorithm: they are confined to the assumption of isotropic spherical clusters, which limits their adaptability to data with varied characteristics. To solve this problem, we propose the scattered centroids initialization clustering algorithm based on stratification (SCICS). First, based on the concept of influence space, we present a method for modeling the cluster-level location of any object, from which well-defined cluster decision boundaries can be obtained through stratification. On this basis, by extending the seed idea, we propose a semi-supervised subclustering algorithm that breaks through the limitations of partitional clustering methods that rely on strict assumptions about particular cluster distributions. Experiments on artificial and real-world datasets show that the proposed algorithm can cluster arbitrarily shaped data and surpasses its competitors in performance and adaptability.
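For context, the Seeded-Kmeans baseline named in the abstract can be sketched as follows. This is a minimal illustration of the classical baseline only, not of SCICS, whose influence-space and stratification steps are not specified in this record. The function name `seeded_kmeans` and its parameters are hypothetical, and the sketch assumes every cluster has at least one labelled seed.

```python
import numpy as np

def seeded_kmeans(X, seed_idx, seed_labels, n_iter=100):
    """Seeded-Kmeans sketch: labelled seeds only initialise the centroids."""
    X = np.asarray(X, dtype=float)
    seed_idx = np.asarray(seed_idx)
    seed_labels = np.asarray(seed_labels)
    classes = np.unique(seed_labels)
    # Initial centroid of each cluster = mean of its labelled seeds.
    centroids = np.stack(
        [X[seed_idx[seed_labels == c]].mean(axis=0) for c in classes]
    )
    assign = np.full(len(X), -1)
    for _ in range(n_iter):
        # Assign every point to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_assign = dists.argmin(axis=1)
        if np.array_equal(new_assign, assign):
            break  # assignments are stable, so the iteration has converged
        assign = new_assign
        # Recompute centroids; keep the old one if a cluster becomes empty.
        for k in range(len(classes)):
            members = X[assign == k]
            if len(members):
                centroids[k] = members.mean(axis=0)
    return classes[assign], centroids

# Toy data (made up for illustration): two compact groups, one seed each.
X = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centroids = seeded_kmeans(X, seed_idx=[0, 3], seed_labels=[0, 1])
```

Because every point is assigned to its nearest centroid, the resulting decision boundaries are linear between centroids, which is the spherical-cluster limitation the abstract attributes to Kmeans-based methods.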
Pages: 18