Structural Deep Clustering Network

Cited by: 368
Authors
Bo, Deyu [1 ]
Wang, Xiao [1 ]
Shi, Chuan [1 ]
Zhu, Meiqi [1 ]
Lu, Emiao [2 ]
Cui, Peng [3 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing, Peoples R China
[2] Tencent, Shenzhen, Peoples R China
[3] Tsinghua Univ, Beijing, Peoples R China
Source
WEB CONFERENCE 2020: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW 2020) | 2020
Funding
National Natural Science Foundation of China;
Keywords
deep clustering; graph convolutional network; neural network; self-supervised learning; DIMENSIONALITY;
DOI
10.1145/3366423.3380214
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812 ;
Abstract
Clustering is a fundamental task in data analysis. Recently, deep clustering, which derives inspiration primarily from deep learning approaches, has achieved state-of-the-art performance and attracted considerable attention. Current deep clustering methods usually boost the clustering results by means of the powerful representation ability of deep learning, e.g., an autoencoder, suggesting that learning an effective representation for clustering is a crucial requirement. The strength of deep clustering methods is to extract useful representations from the data itself, rather than from the structure of the data, which receives scarce attention in representation learning. Motivated by the great success of the Graph Convolutional Network (GCN) in encoding graph structure, we propose a Structural Deep Clustering Network (SDCN) to integrate structural information into deep clustering. Specifically, we design a delivery operator to transfer the representations learned by the autoencoder to the corresponding GCN layer, and a dual self-supervised mechanism to unify these two different deep neural architectures and guide the update of the whole model. In this way, the multiple structures of the data, from low-order to high-order, are naturally combined with the multiple representations learned by the autoencoder. Furthermore, we theoretically analyze the delivery operator: with the delivery operator, the GCN improves the autoencoder-specific representation as a high-order graph regularization constraint, and the autoencoder helps alleviate the over-smoothing problem in the GCN. Through comprehensive experiments, we demonstrate that our proposed model consistently outperforms state-of-the-art techniques.
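The delivery operator described above can be sketched in a few lines: a GCN layer propagates a mixture of its previous hidden state and the autoencoder representation of the same depth. The sketch below is a minimal NumPy illustration under assumed conventions (symmetric adjacency normalization with self-loops, ReLU activation, and a mixing coefficient `eps`); function names, the exact mixing rule, and the value of `eps` are assumptions for illustration, not the paper's verbatim formulation.

```python
import numpy as np

def normalize_adj(adj):
    # Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2},
    # the standard GCN preprocessing step.
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer_with_delivery(adj_norm, h_prev, h_ae, weight, eps=0.5):
    # Delivery operator (sketch): mix the GCN hidden state with the
    # autoencoder representation of the corresponding layer, then
    # propagate over the normalized graph and apply ReLU.
    mixed = (1.0 - eps) * h_prev + eps * h_ae
    return np.maximum(adj_norm @ mixed @ weight, 0.0)
```

With `eps = 0` the layer reduces to a plain GCN layer; with `eps > 0` the structure-aware propagation is regularized by the autoencoder's layer-wise features, which is how the abstract's "high-order graph regularization" reading arises.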
Pages: 1400-1410 (11 pages)
Related papers
31 in total
[1]  
Aggarwal C.C., 2012, Mining Text Data, DOI 10.1007/978-1-4614-3223-4_4
[2]  
[Anonymous], 2015, CoRR abs/1511.05644
[3]  
[Anonymous], 2016, BAYESIAN DEEP LEARNI
[4]  
[Anonymous], 2017, PROC INT C LEARN REP
[5]  
[Anonymous], 2010, INT C MACH LEARN
[6]   Laplacian eigenmaps for dimensionality reduction and data representation [J].
Belkin, M ;
Niyogi, P .
NEURAL COMPUTATION, 2003, 15 (06) :1373-1396
[7]   Deep Clustering for Unsupervised Learning of Visual Features [J].
Caron, Mathilde ;
Bojanowski, Piotr ;
Joulin, Armand ;
Douze, Matthijs .
COMPUTER VISION - ECCV 2018, PT XIV, 2018, 11218 :139-156
[8]  
Guo XF, 2017, PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, P1753
[9]  
Hartigan J. A., 1979, Applied Statistics, V28, P100, DOI 10.2307/2346830
[10]   A Survey of Deep Learning: Platforms, Applications and Emerging Research Trends [J].
Hatcher, William Grant ;
Yu, Wei .
IEEE ACCESS, 2018, 6 :24411-24432