Learning Representation for Clustering Via Prototype Scattering and Positive Sampling

Cited by: 46
Authors
Huang, Zhizhong [1,2]
Chen, Jie [1,2]
Zhang, Junping [1,2]
Shan, Hongming [3,4]
Affiliations
[1] Fudan Univ, Shanghai Key Lab Intelligent Informat Proc, Shanghai 200433, Peoples R China
[2] Fudan Univ, Sch Comp Sci, Shanghai 200433, Peoples R China
[3] Fudan Univ, Inst Sci & Technol Brain Inspired Intelligence, MOE Frontiers Ctr Brain Sci, Key Lab Computat Neurosci & Brain Inspired Intelli, Shanghai 200433, Peoples R China
[4] Shanghai Ctr Brain Sci & Brain Inspired Technol, Shanghai 201210, Peoples R China
Funding
National Natural Science Foundation of China; Natural Science Foundation of Shanghai;
Keywords
Prototypes; Scattering; Representation learning; Task analysis; Self-supervised learning; Clustering methods; Semantics; Contrastive learning; deep clustering; representation learning; self-supervised learning; unsupervised learning;
DOI
10.1109/TPAMI.2022.3216454
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing deep clustering methods rely on either contrastive or non-contrastive representation learning for the downstream clustering task. Contrastive methods learn uniform representations for clustering thanks to negative pairs; those negative pairs, however, may inevitably lead to the class collision issue and consequently compromise the clustering performance. Non-contrastive methods, on the other hand, avoid the class collision issue, but the resulting non-uniform representations may cause clustering to collapse. To enjoy the strengths of both worlds, this paper presents a novel end-to-end deep clustering method with prototype scattering and positive sampling, termed ProPos. Specifically, we first maximize the distance between prototypical representations via a prototype scattering loss, which improves the uniformity of representations. Second, we align one augmented view of an instance with sampled neighbors of another view, assumed to be a truly positive pair in the embedding space, to improve within-cluster compactness, termed positive sampling alignment. The strengths of ProPos are an avoided class collision issue, uniform representations, well-separated clusters, and within-cluster compactness. By optimizing ProPos in an end-to-end expectation-maximization framework, extensive experimental results demonstrate that ProPos achieves competitive performance on moderate-scale clustering benchmarks and establishes new state-of-the-art performance on large-scale datasets. Source code is available at https://github.com/Hzzone/ProPos.
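The two losses named in the abstract can be illustrated with a minimal NumPy sketch. This is an assumption-laden paraphrase, not the paper's exact formulation: the function names, the Gaussian form of positive sampling, and all hyperparameter values here are illustrative; consult the linked source code for the real implementation.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    """Project rows onto the unit hypersphere."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def prototype_scattering_loss(protos_a, protos_b, tau=0.5):
    """Contrastive loss over cluster prototypes rather than instances:
    each prototype from one view is pulled toward its counterpart in the
    other view and pushed away from all other prototypes. Scattering
    prototypes improves uniformity without instance-level negative pairs,
    so no class collision arises. (Illustrative sketch.)"""
    pa = l2_normalize(protos_a)
    pb = l2_normalize(protos_b)
    logits = pa @ pb.T / tau                         # (K, K) similarities
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))               # cross-entropy on diagonal

def positive_sampling_alignment(z1, z2, sigma=0.001, rng=None):
    """Align one view's embeddings with neighbors sampled around the other
    view (here: a Gaussian perturbation, an assumed form), which encourages
    within-cluster compactness."""
    if rng is None:
        rng = np.random.default_rng(0)
    neighbor = l2_normalize(z2 + sigma * rng.standard_normal(z2.shape))
    return np.mean(np.sum((l2_normalize(z1) - neighbor) ** 2, axis=1))
```

In the paper's end-to-end EM view, cluster assignments and prototypes would be estimated in the E-step (e.g., by k-means on the embeddings) and the network trained on a weighted sum of these two losses in the M-step; the sketch above covers only the losses themselves.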
Pages: 7509 - 7524 (16 pages)