Combining Semi-supervised Clustering and Classification Under a Generalized Framework

Cited by: 0
Authors
Jiang, Zhen [1, 2]
Zhao, Lingyun [1 ]
Lu, Yu [1 ]
Affiliations
[1] Jiangsu Univ, Sch Comp Sci & Commun Engn, Zhenjiang, Peoples R China
[2] Jiangsu Prov Big Data Ubiquitous Percept & Intelli, Zhenjiang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Co-training; Classification; Semi-supervised clustering; Cluster-splitting
DOI
10.1007/s00357-024-09489-9
Chinese Library Classification (CLC) number
O1 [Mathematics]
Discipline classification codes
0701; 070101
Abstract
Most machine learning algorithms need a sufficient amount of labeled data to train a reliable classifier. However, labeling data is often costly and time-consuming, whereas unlabeled data are usually readily available. Learning from both labeled and unlabeled data has therefore become an active research topic. Inspired by the co-training algorithm, we present a learning framework called CSCC, which combines semi-supervised clustering and classification to learn from both labeled and unlabeled data. Unlike existing co-training style methods, which construct diverse classifiers that learn from each other, CSCC leverages the diversity between semi-supervised clustering and classification models to achieve mutual enhancement. Existing classification algorithms can be easily adapted to CSCC, allowing them to generalize from only a few labeled examples. In particular, to bridge the gap between class information and clustering, we propose a semi-supervised hierarchical clustering algorithm that uses labeled data to guide the cluster-splitting process. Within the CSCC framework, we introduce two loss functions that supervise the iterative updating of the semi-supervised clustering and classification models, respectively. Extensive experiments on a variety of benchmark datasets validate the superiority of CSCC over other state-of-the-art methods.
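The abstract describes cluster-splitting guided by labeled data but gives no algorithmic detail. The sketch below is one hypothetical reading of that idea, not the authors' CSCC algorithm: a top-down procedure that keeps splitting any cluster containing labeled points of more than one class, seeding each two-way split with the labeled class means, and then passes each cluster's majority label to its unlabeled members. All function names, the 2-means split, and the toy data are assumptions made for illustration, and the two loss functions mentioned in the abstract are not modeled.

```python
from collections import Counter


def mean(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split_by_labels(X, y, idx):
    """Split one cluster in two with 2-means, seeded by the means of the
    two most frequent labeled classes in it (an assumed splitting rule)."""
    counts = Counter(y[i] for i in idx if y[i] is not None)
    (c1, _), (c2, _) = counts.most_common(2)
    centers = [mean([X[i] for i in idx if y[i] == c]) for c in (c1, c2)]
    parts = [[], []]
    for _ in range(10):  # a few Lloyd iterations are enough for a sketch
        parts = [[], []]
        for i in idx:
            near = 0 if sq_dist(X[i], centers[0]) <= sq_dist(X[i], centers[1]) else 1
            parts[near].append(i)
        if not parts[0] or not parts[1]:
            break  # degenerate split, caller keeps the cluster whole
        centers = [mean([X[i] for i in part]) for part in parts]
    return parts


def label_guided_clustering(X, y):
    """Keep splitting clusters whose labeled members disagree on the class."""
    queue, done = [list(range(len(X)))], []
    while queue:
        idx = queue.pop()
        classes = {y[i] for i in idx if y[i] is not None}
        if len(classes) <= 1:
            done.append(idx)  # pure (or fully unlabeled) cluster
            continue
        parts = split_by_labels(X, y, idx)
        if not parts[0] or not parts[1]:
            done.append(idx)  # could not be separated further
        else:
            queue.extend(parts)
    return done


def propagate(y, clusters):
    """Give each unlabeled point the majority label of its cluster."""
    out = list(y)
    for idx in clusters:
        labels = [y[i] for i in idx if y[i] is not None]
        if labels:
            major = Counter(labels).most_common(1)[0][0]
            for i in idx:
                if out[i] is None:
                    out[i] = major
    return out


# Toy data (made up): two well-separated groups, one labeled point in each.
X = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
y = ["a", None, None, "b", None, None]
clusters = label_guided_clustering(X, y)
filled = propagate(y, clusters)
```

In a co-training style loop of the kind the abstract outlines, labels propagated this way could serve as pseudo-labels for a classifier, whose confident predictions would in turn add labeled points for the next round of splitting. That exchange is only summarized in the abstract, so it is left out of the sketch.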
Pages: 181-204
Page count: 24