Combining Semi-supervised Clustering and Classification Under a Generalized Framework

Times cited: 0
Authors
Jiang, Zhen [1, 2]
Zhao, Lingyun [1 ]
Lu, Yu [1 ]
Affiliations
[1] Jiangsu Univ, Sch Comp Sci & Commun Engn, Zhenjiang, Peoples R China
[2] Jiangsu Prov Big Data Ubiquitous Percept & Intelli, Zhenjiang, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Co-training; Classification; Semi-supervised clustering; Cluster-splitting;
DOI
10.1007/s00357-024-09489-9
Chinese Library Classification
O1 [Mathematics];
Discipline classification codes
0701; 070101;
Abstract
Most machine learning algorithms need a sufficient amount of labeled data to train a reliable classifier. However, labeling data is often costly and time-consuming, while unlabeled data is readily accessible. Learning from both labeled and unlabeled data has therefore become an active research topic. Inspired by the co-training algorithm, we present a learning framework called CSCC, which combines semi-supervised clustering and classification to learn from both labeled and unlabeled data. Unlike existing co-training style methods that construct diverse classifiers to learn from each other, CSCC leverages the diversity between semi-supervised clustering and classification models to achieve mutual enhancement. Existing classification algorithms can be easily adapted to CSCC, allowing them to generalize from only a few labeled examples. In particular, to bridge the gap between class information and clustering, we propose a semi-supervised hierarchical clustering algorithm that uses labeled data to guide the cluster-splitting process. Within the CSCC framework, we introduce two loss functions that supervise the iterative updating of the semi-supervised clustering and classification models, respectively. Extensive experiments on a variety of benchmark datasets validate the superiority of CSCC over other state-of-the-art methods.
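The general idea of letting a clustering model and a classifier label data for each other can be illustrated with a small sketch. This is not the CSCC algorithm from the paper: the abstract does not give its details, so the sketch swaps in seeded k-means for the proposed hierarchical cluster-splitting method, uses a 1-nearest-neighbor classifier, and replaces the two loss functions with a simple agreement rule. All function names (`seeded_kmeans`, `knn_predict`, `mutual_labeling`) are hypothetical.

```python
import math


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(col) / len(vectors) for col in zip(*vectors))


def seeded_kmeans(points, labeled, iters=10):
    """Assumed stand-in for semi-supervised clustering: one cluster per class,
    centroids initialised from labeled points, labeled points never reassigned."""
    classes = sorted(set(labeled.values()))
    centers = {c: mean([points[i] for i, y in labeled.items() if y == c])
               for c in classes}
    assign = {}
    for _ in range(iters):
        for i, p in enumerate(points):
            if i in labeled:
                assign[i] = labeled[i]
            else:
                assign[i] = min(classes, key=lambda c: math.dist(p, centers[c]))
        # Every class keeps at least its labeled points, so no cluster is empty.
        centers = {c: mean([points[i] for i in assign if assign[i] == c])
                   for c in classes}
    return assign


def knn_predict(point, points, labeled):
    """1-nearest-neighbor classifier over the currently labeled points."""
    nearest = min(labeled, key=lambda i: math.dist(point, points[i]))
    return labeled[nearest]


def mutual_labeling(points, seed_labels, rounds=5):
    """Pseudo-label unlabeled points on which clustering and classifier agree."""
    labeled = dict(seed_labels)
    for _ in range(rounds):
        clusters = seeded_kmeans(points, labeled)
        agreed = {i: clusters[i] for i, p in enumerate(points)
                  if i not in labeled
                  and knn_predict(p, points, labeled) == clusters[i]}
        if not agreed:
            break  # no new agreement, nothing more to add
        labeled.update(agreed)
    # Points still unlabeled fall back to the classifier's prediction.
    return [labeled[i] if i in labeled else knn_predict(p, points, labeled)
            for i, p in enumerate(points)]


points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
labels = mutual_labeling(points, {0: "a", 4: "b"})
```

With one labeled point per group, the two models agree on every remaining point in the first round, so all eight points receive a label. The agreement rule is only one possible way to exchange information between the two models; the paper's loss-based updating may behave quite differently.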
Pages: 181 - 204
Number of pages: 24
Related papers
50 in total
  • [21] Porcelain Image Classification Based on Semi-supervised Mean Shift Clustering
    Zhou, Pengbo
    Wang, Kegang
    PROCEEDINGS OF 2017 8TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE (ICSESS 2017), 2017, : 791 - 797
  • [22] Semi-supervised Clustering Framework Based on Active Learning for Real Data
    Odate, Ryosuke
    Shinjo, Hiroshi
    Suzuki, Yasufumi
    Motobayashi, Masahiro
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, S+SSPR 2018, 2018, 11004 : 184 - 193
  • [23] A semi-supervised framework for concept-based hierarchical document clustering
    Sadjadi, Seyed Mojtaba
    Mashayekhi, Hoda
    Hassanpour, Hamid
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2023, 26 (06): : 3861 - 3890
  • [24] Clustering and semi-supervised classification for clickstream data via mixture models
    Gallaugher, Michael P. B.
    Mcnicholas, Paul D.
    CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE, 2024, 52 (03): : 678 - 695
  • [25] A classification-based approach to semi-supervised clustering with pairwise constraints
    Smieja, Marek
    Struski, Lukasz
    Figueiredo, Mario A. T.
    NEURAL NETWORKS, 2020, 127 : 193 - 203
  • [26] A semi-supervised framework for concept-based hierarchical document clustering
    Seyed Mojtaba Sadjadi
    Hoda Mashayekhi
    Hamid Hassanpour
    World Wide Web, 2023, 26 : 3861 - 3890
  • [27] Spectral clustering: A semi-supervised approach
    Chen, Weifu
    Feng, Guocan
    NEUROCOMPUTING, 2012, 77 (01) : 229 - 242
  • [28] Research Progress on Semi-Supervised Clustering
    Yue Qin
    Shifei Ding
    Lijuan Wang
    Yanru Wang
    Cognitive Computation, 2019, 11 : 599 - 612
  • [29] Composite kernels for semi-supervised clustering
    Domeniconi, Carlotta
    Peng, Jing
    Yan, Bojun
    KNOWLEDGE AND INFORMATION SYSTEMS, 2011, 28 (01) : 99 - 116
  • [30] A Semi-supervised Clustering for Incomplete Data
    Goel, Sonia
    Tushir, Meena
    APPLICATIONS OF ARTIFICIAL INTELLIGENCE TECHNIQUES IN ENGINEERING, SIGMA 2018, VOL 1, 2019, 698 : 323 - 331