Pyramid contrastive learning for clustering

Cited by: 2
Authors
Zhou, Zi-Feng [1]
Huang, Dong [1, 3]
Wang, Chang-Dong [2, 4]
Affiliations
[1] South China Agr Univ, Coll Math & Informat, Guangzhou, Peoples R China
[2] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou, Peoples R China
[3] Minist Agr & Rural Affairs, Key Lab Smart Agr Technol Trop South China, Guangzhou, Peoples R China
[4] Guangdong Key Lab Big Data Anal & Proc, Guangzhou, Peoples R China
Keywords
Data clustering; Deep clustering; Contrastive clustering; Image clustering; CNN-transformer encoder
DOI
10.1016/j.neunet.2025.107217
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With its ability to jointly perform representation learning and clustering via deep neural networks, deep clustering has gained significant attention in recent years. Despite considerable progress, most previous deep clustering methods still suffer from three critical limitations. First, they tend to attach a distribution-based clustering loss to the neural network, which often overlooks the sample-wise contrastiveness needed for discriminative representation learning. Second, they generally use the features learned at a single layer for clustering and do not exploit multiple layers for joint multi-layer (multi-stage) learning. Third, they typically use a convolutional neural network (CNN) for clustering images, which focuses on local information but cannot capture global dependencies well. To tackle these issues, this paper presents a new deep clustering method called pyramid contrastive learning for clustering (PCLC), which incorporates a pyramidal contrastive architecture to jointly enforce contrastive learning and clustering at multiple network layers (or stages). Specifically, for an input image, two types of augmentations are first performed to generate two parallel augmented views. To bridge the gap between the CNN (for capturing local information) and the Transformer (for reflecting global dependencies), a mixed CNN-Transformer encoder is used as the backbone, whose CNN-Transformer blocks are further divided into four stages, giving rise to a pyramid of multi-stage feature representations. Thereafter, multiple stages of twin contrastive learning are conducted simultaneously at both the instance level and the cluster level, and the final clustering is obtained by optimizing them. Extensive experiments on multiple challenging image datasets demonstrate the superior clustering performance of PCLC over state-of-the-art methods.
The source code is available at https://github.com/Zachary-Chow/PCLC.
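The multi-stage twin contrastive objective described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked above for that). It assumes an NT-Xent-style loss at both levels, an unweighted sum over stages, and illustrative temperatures; the names `nt_xent`, `softmax`, and `pyramid_loss` are hypothetical, and the random arrays stand in for the per-stage features and cluster logits that the CNN-Transformer encoder would produce.

```python
import numpy as np


def nt_xent(a, b, temperature=0.5):
    """NT-Xent-style loss: row i of `a` and row i of `b` are a positive pair,
    every other row in the concatenated batch is a negative."""
    n = a.shape[0]
    z = np.concatenate([a, b], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit rows -> cosine similarity
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)                    # a row is never its own negative
    # Row-wise log-softmax, computed stably.
    m = sim.max(axis=1, keepdims=True)
    log_prob = sim - m - np.log(np.exp(sim - m).sum(axis=1, keepdims=True))
    positive = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    return -log_prob[np.arange(2 * n), positive].mean()


def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


def pyramid_loss(stages_a, stages_b, tau_instance=0.5, tau_cluster=1.0):
    """Sum instance-level and cluster-level contrastive losses over all stages.

    stages_a / stages_b: one (features, cluster_logits) pair per stage, for
    augmented view a and view b. Equal stage weights are an assumption.
    """
    total = 0.0
    for (feat_a, logit_a), (feat_b, logit_b) in zip(stages_a, stages_b):
        # Instance level: contrast samples (rows of the feature matrices).
        instance = nt_xent(feat_a, feat_b, tau_instance)
        # Cluster level: contrast clusters (columns of the soft-assignment matrices).
        cluster = nt_xent(softmax(logit_a).T, softmax(logit_b).T, tau_cluster)
        total += instance + cluster
    return total


# Usage with random stand-ins: 8 images, 4 stages, 3 clusters.
rng = np.random.default_rng(0)
dims = (16, 32, 64, 128)  # hypothetical per-stage feature widths
view_a = [(rng.normal(size=(8, d)), rng.normal(size=(8, 3))) for d in dims]
view_b = [(rng.normal(size=(8, d)), rng.normal(size=(8, 3))) for d in dims]
print(pyramid_loss(view_a, view_b))
```

In this sketch the cluster-level term reuses the same loss on the transposed soft-assignment matrices, so each cluster is represented by its column of assignment probabilities across the batch. Any regularizers or stage weights the actual method may use are omitted.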
Pages: 12