Deep multi-view document clustering with enhanced semantic embedding

Cited by: 39
Authors
Bai, Ruina [1]
Huang, Ruizhang [1,2]
Chen, Yanping [1,2]
Qin, Yongbin [1,2]
Affiliations
[1] Guizhou Univ, Coll Comp Sci & Technol, Guiyang, Peoples R China
[2] Guizhou Prov Key Lab Publ Big Data, Guiyang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Multi-view learning; Document clustering; Enhanced semantic mapping; Representations
DOI
10.1016/j.ins.2021.02.027
Chinese Library Classification
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Multi-view clustering, which aims to group data with multiple views, has recently attracted intense research attention. Text documents bring additional difficulties to multi-view clustering because document views are sparse, high-dimensional, and inconsistent. In this paper, we introduce a novel model for multi-view document clustering with enhanced semantic embedding, namely MDCE, to address these difficulties when clustering text documents that have more than one representation view. Enhanced semantic embedders are designed to learn and improve the semantic mapping from the higher-dimensional document space to a lower-dimensional feature space using complementary semantic information. Specifically, three types of complementary semantic information are used in an unsupervised manner: neighbour-wise, view-wise, and cluster-wise. A deep network is designed to simultaneously optimize the enhanced semantic mapping, integrate the lower-dimensional features from multiple views, and discover document clustering assignments. We conducted extensive experiments on the proposed MDCE model using realistic datasets and compared it with a number of state-of-the-art multi-view clustering approaches. Experimental results demonstrate that the MDCE-related models perform substantially better than all other models. (c) 2021 Elsevier Inc. All rights reserved.
Pages: 273-287
Page count: 15