A Unified Multi-view Clustering Algorithm Using Multi-objective Optimization Coupled with Generative Model

Citations: 15
Authors
Mitra, Sayantan [1]
Hasanuzzaman, Mohammed [2]
Saha, Sriparna [1]
Affiliations
[1] Indian Inst Technol Patna, Bihta 801103, Bihar, India
[2] Dublin City Univ, ADAPT Ctr, Sch Comp, Dublin, Ireland
Keywords
Multi-objective clustering; multi-view clustering; generative model; search result clustering; PIXEL CLASSIFICATION;
DOI
10.1145/3365673
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
There is a large body of work on multi-view clustering that exploits multiple representations (or views) of the same input data for better convergence. These views can come from multiple modalities (image, audio, text) or from different feature subsets. Obtaining one consensus partitioning after considering different views is usually a non-trivial task. Recently, multi-objective based multi-view clustering methods have surpassed the performance of single-objective based multi-view clustering techniques. One key problem is that it is difficult to select a single solution from the set of alternative partitionings that multi-objective techniques generate on the final Pareto optimal front. In this article, we propose a novel multi-objective based multi-view clustering framework that overcomes the problem of selecting a single solution in multi-objective based techniques. In particular, our proposed framework has three major components: (i) a multi-view based multi-objective algorithm, Multiview-AMOSA, for initial clustering of data points; (ii) a generative model for generating a combined solution having probabilistic labels; and (iii) the K-means algorithm for obtaining the final labels. As the first component, we have adopted a recently developed multi-view based multi-objective clustering algorithm to generate different possible consensus partitionings of a given dataset, taking into account different views. A generative model is coupled with the first component to generate a single consensus partitioning after considering multiple solutions. It exploits the latent subsets of the non-dominated solutions obtained from the multi-objective clustering algorithm and combines them to produce a single probabilistically labeled solution. Finally, a simple clustering algorithm, namely K-means, is applied to the generated probabilistic labels to obtain the final cluster labels.
Experimental validation of our proposed framework is carried out over several benchmark datasets belonging to four different domains: UCI datasets, multi-view datasets, search result clustering datasets, and patient stratification datasets. Experimental results show that our proposed framework achieves an improvement of around 2%-4% over different evaluation metrics in all four domains in comparison to state-of-the-art methods.
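Two ideas in the abstract can be illustrated with a small sketch: keeping only the non-dominated (Pareto optimal) candidate solutions, and running K-means on rows of probabilistic labels to obtain hard cluster labels. The code below is not the authors' implementation. It does not reproduce Multiview-AMOSA or the generative model, and the function names, the assumption that every objective is minimized, and the deterministic K-means initialization are all hypothetical choices made for illustration.

```python
import math


def dominates(a, b):
    """True if objective vector a dominates b (assumption: all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points):
    """Indices of the points that no other point dominates (the Pareto front)."""
    return [
        i for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


def kmeans(rows, k, iters=100):
    """Plain Lloyd's K-means on small lists of floats.

    Assumption: centers are initialized to the first k distinct rows so the
    sketch is deterministic; real implementations use better seeding.
    """
    centers = []
    for r in rows:
        if list(r) not in centers:
            centers.append(list(r))
        if len(centers) == k:
            break
    if len(centers) < k:
        raise ValueError("need at least k distinct rows")

    labels = None
    for _ in range(iters):
        # Assignment step: each row goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: math.dist(r, centers[c])) for r in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical objective values for four candidate partitionings.
front = non_dominated([(0.2, 0.9), (0.4, 0.5), (0.5, 0.6), (0.9, 0.1)])

# Hypothetical probabilistic labels: one row per data point, one column per cluster.
soft_labels = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
hard_labels = kmeans(soft_labels, k=2)
```

In this toy run the third candidate is dominated by the second and is dropped from the front, and the four soft-label rows split into two groups of two. How the non-dominated solutions are actually combined into probabilistic labels is the job of the generative model described in the abstract, which this sketch leaves out.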
Pages: 31