Robust unsupervised image categorization based on variational autoencoder with disentangled latent representations

Cited by: 6
Authors
Yang, Lin [1 ]
Fan, Wentao [1 ]
Bouguila, Nizar [2 ]
Affiliations
[1] Huaqiao Univ, Dept Comp Sci & Technol, Xiamen, Peoples R China
[2] Concordia Univ, Concordia Inst Informat Syst Engn CIISE, Montreal, PQ, Canada
Funding
National Natural Science Foundation of China;
Keywords
Clustering; Variational autoencoder (VAE); Disentangled latent representations; Robust training; Mixture model; Student's-t distribution;
DOI
10.1016/j.knosys.2022.108671
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, deep generative models have been successfully applied to unsupervised clustering analysis, owing to their ability to learn good representations of the input data in a lower-dimensional latent space. In this work, we propose a robust deep generative clustering method based on a variational autoencoder (VAE) for unsupervised image categorization. The merits of our method can be summarized as follows. First, each latent representation produced by the encoder is disentangled into a cluster representation and a generation representation, where the cluster representation preserves the clustering information and the generation representation preserves the information needed for generation. Thus, by utilizing only the cluster representation, we can improve the performance and efficiency of the clustering task without interference from the generation task. Second, a Student's-t mixture model is adopted as the prior over the cluster representation to enhance the robustness of our method against clustering outliers. Third, we propose a bi-augmentation module to promote training stability. In contrast to most existing deep generative clustering methods, which require a pretraining step to stabilize training, our model achieves a stable training process through feature disentanglement and data augmentation. We validate the proposed robust deep generative clustering method through extensive experiments, comparing it with state-of-the-art methods on unsupervised image categorization. (C) 2022 Elsevier B.V. All rights reserved.
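The following is a minimal illustrative sketch, not the authors' implementation, of two ideas described in the abstract: an encoder whose latent output is split into a cluster code and a generation code, and a Student's-t mixture log-density that could serve as a heavy-tailed prior over the cluster code. All names, layer sizes, and hyperparameters (DisentangledEncoder, z_c_dim, z_g_dim, the number of components K, and the degrees of freedom) are assumptions made for illustration only.

```python
# Hypothetical sketch of a disentangled VAE encoder with a Student's-t mixture prior.
# Architecture choices and names below are assumptions, not taken from the paper.
import torch
import torch.nn as nn
import torch.distributions as D


class DisentangledEncoder(nn.Module):
    def __init__(self, in_dim=784, hid_dim=256, z_c_dim=10, z_g_dim=32):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.ReLU())
        # Separate heads so the cluster code z_c and generation code z_g are kept apart.
        self.mu_c = nn.Linear(hid_dim, z_c_dim)
        self.logvar_c = nn.Linear(hid_dim, z_c_dim)
        self.mu_g = nn.Linear(hid_dim, z_g_dim)
        self.logvar_g = nn.Linear(hid_dim, z_g_dim)

    def forward(self, x):
        h = self.backbone(x)
        return (self.mu_c(h), self.logvar_c(h)), (self.mu_g(h), self.logvar_g(h))


def reparameterize(mu, logvar):
    # Standard VAE reparameterization trick: z = mu + sigma * eps.
    eps = torch.randn_like(mu)
    return mu + torch.exp(0.5 * logvar) * eps


def student_t_mixture_log_prob(z, means, scales, dfs, weights):
    # log p(z) under a mixture of diagonal Student's-t components; the heavy tails
    # make this prior less sensitive to outliers than a Gaussian mixture.
    comp = D.Independent(D.StudentT(dfs, means, scales), 1)       # K components
    log_comp = comp.log_prob(z.unsqueeze(1))                      # shape (N, K)
    return torch.logsumexp(torch.log(weights) + log_comp, dim=1)  # shape (N,)


if __name__ == "__main__":
    x = torch.rand(8, 784)                       # e.g. flattened 28x28 images
    enc = DisentangledEncoder()
    (mu_c, logvar_c), (mu_g, logvar_g) = enc(x)
    z_c = reparameterize(mu_c, logvar_c)         # used for clustering only

    K, z_c_dim = 5, 10                           # hypothetical number of clusters
    means = torch.zeros(K, z_c_dim)
    scales = torch.ones(K, z_c_dim)
    dfs = torch.full((K, z_c_dim), 5.0)          # degrees of freedom per dimension
    weights = torch.full((K,), 1.0 / K)
    print(student_t_mixture_log_prob(z_c, means, scales, dfs, weights).shape)
```

In this sketch, only z_c would be fed to the clustering objective, while z_g, together with z_c, would drive reconstruction; the generation-specific factors are thereby kept out of the clustering path, in the spirit of the disentanglement described above.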
Pages: 9