Pseudo-Label Guided Collective Matrix Factorization for Multiview Clustering

Cited by: 93
Authors
Wang, Di [1]
Han, Songwei [1]
Wang, Quan [1]
He, Lihuo [2]
Tian, Yumin [1]
Gao, Xinbo [2]
Affiliations
[1] Xidian Univ, Sch Comp Sci & Technol, Xian 710071, Peoples R China
[2] Xidian Univ, Sch Elect Engn, Xian 710071, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
Clustering methods; Interviews; Iterative methods; Cybernetics; Correlation; Manifolds; Computational efficiency; Matrix factorization; multiview clustering; pseudo-label; ALGORITHM;
DOI
10.1109/TCYB.2021.3051182
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Multiview clustering has aroused increasing attention in recent years since real-world data are always comprised of multiple features or views. Despite the existing clustering methods having achieved promising performance, there still remain some challenges to be solved: 1) most existing methods are unscalable to large-scale datasets due to the high computational burden of eigendecomposition or graph construction and 2) most methods learn latent representations and cluster structures separately. Such a two-step learning scheme neglects the correlation between the two learning stages and may obtain a suboptimal clustering result. To address these challenges, a pseudo-label guided collective matrix factorization (PLCMF) method that jointly learns latent representations and cluster structures is proposed in this article. The proposed PLCMF first performs clustering on each view separately to obtain pseudo-labels that reflect the intraview similarities of each view. Then, it adds a pseudo-label constraint on collective matrix factorization to learn unified latent representations, which preserve the intraview and interview similarities simultaneously. Finally, it intuitively incorporates latent representation learning and cluster structure learning into a joint framework to directly obtain clustering results. Besides, the weight of each view is learned adaptively according to data distribution in the joint framework. In particular, the joint learning problem can be solved with an efficient iterative updating method with linear complexity. Extensive experiments on six benchmark datasets indicate the superiority of the proposed method over state-of-the-art multiview clustering methods in both clustering accuracy and computational efficiency.
Pages: 8681-8691
Number of pages: 11