Adversarial Multiview Clustering Networks With Adaptive Fusion

Cited by: 48
Authors
Wang, Qianqian [1, 2]
Tao, Zhiqiang [3 ]
Xia, Wei [4 ]
Gao, Quanxue [4 ]
Cao, Xiaochun [5 ]
Jiao, Licheng [6 ]
Affiliations
[1] Xidian Univ, Minist Educ Intellisense & Image Understanding, State Key Lab Integrated Serv Networks, Xian 710071, Peoples R China
[2] Xidian Univ, Minist Educ Intellisense & Image Understanding, Key Lab, Xian 710071, Peoples R China
[3] Santa Clara Univ, Dept Comp Sci & Engn, Santa Clara, CA 95053 USA
[4] Xidian Univ, State Key Lab Integrated Serv Networks, Xian 710071, Peoples R China
[5] Sun Yat Sen Univ, Sch Cyber Sci & Technol, Shenzhen Campus, Shenzhen 518107, Peoples R China
[6] Xidian Univ, Sch Artificial Intelligence, Minist Educ Intellisense & Image Understanding, Key Lab, Xian 710071, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
Feature extraction; Image reconstruction; Generators; Data models; Clustering algorithms; Training; Representation learning; Adaptive fusion; adversarial training; multiview clustering (MVC); KERNEL;
DOI
10.1109/TNNLS.2022.3145048
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing deep multiview clustering (MVC) methods are mainly based on autoencoder networks, which seek common latent variables to reconstruct the original input of each view individually. However, because the reconstruction loss is view-specific, it is challenging to extract latent representations that are consistent across multiple views for clustering. To address this challenge, we propose adversarial MVC (AMvC) networks in this article. The proposed AMvC generates each view's samples conditioned on the fused latent representations of the different views to encourage a more consistent clustering structure. Specifically, multiview encoders are used to extract latent descriptions from all the views, and the corresponding generators are used to generate the reconstructed samples. The discriminative networks and the mean squared loss are jointly used to train the multiview encoders and generators, balancing the distinctness and consistency of each view's latent representation. Moreover, an adaptive fusion layer is developed to obtain a shared latent representation, on which a clustering loss and the ${l_{1,2}}$-norm constraint are further imposed to improve clustering performance and distinguish the latent space. Experimental results on video, image, and text datasets demonstrate that AMvC outperforms several state-of-the-art deep MVC methods.
Pages: 7635 - 7647
Page count: 13
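
The abstract mentions an adaptive fusion layer and an ${l_{1,2}}$-norm constraint on the shared latent representation but does not define either. The following is only an illustrative sketch under stated assumptions, not the authors' implementation: fusion is assumed to be a softmax-weighted sum of per-view latent matrices, and the ${l_{1,2}}$-norm is assumed to follow one common convention (the l2 norm of the row-wise l1 norms). The names `adaptive_fusion`, `l12_norm`, and `logits` are hypothetical.

```python
import numpy as np


def adaptive_fusion(latents, logits):
    """Fuse per-view latent matrices with softmax-normalized weights.

    latents: list of V arrays, each of shape (n_samples, d).
    logits:  V unconstrained scores (hypothetical stand-ins for learnable
             fusion parameters; the paper's parameterization may differ).
    Returns the fused (n_samples, d) matrix and the V fusion weights.
    """
    logits = np.asarray(logits, dtype=float)
    # Numerically stable softmax so the weights are positive and sum to 1.
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    stacked = np.stack(latents, axis=0)             # shape (V, n, d)
    fused = np.tensordot(weights, stacked, axes=1)  # shape (n, d)
    return fused, weights


def l12_norm(z):
    """Assumed l_{1,2} convention: l1 norm of each row, then l2 over rows."""
    row_l1 = np.abs(z).sum(axis=1)
    return float(np.sqrt((row_l1 ** 2).sum()))


# Two toy views with identical logits, so each view gets weight 0.5.
view_a = np.ones((2, 3))
view_b = 3.0 * np.ones((2, 3))
fused, weights = adaptive_fusion([view_a, view_b], logits=[0.0, 0.0])
penalty = l12_norm(fused)
```

In a trainable model the `logits` would be optimized jointly with the encoders, and the penalty would be added to the clustering loss with a trade-off coefficient; both choices are assumptions here.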