Graph modularity maximization as an effective method for co-clustering text data

Cited by: 24
Authors
Ailem, Melissa [1]
Role, Francois [1]
Nadif, Mohamed [1]
Affiliations
[1] Univ Paris 05, Sorbonne Paris Cite, LIPADE, F-75006 Paris, France
Keywords
Co-clustering; Modularity
DOI
10.1016/j.knosys.2016.07.002
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we show how the modularity measure can serve as a useful criterion for co-clustering document-term matrices. We present and investigate the performance of CoClus, a novel, effective block-diagonal co-clustering algorithm which directly maximizes this modularity measure. The maximization is performed using an iterative alternating optimization procedure, in contrast to algorithms that use spectral relaxations of the discrete optimization problems. Extensive comparative experiments performed on various document-term datasets demonstrate that this approach is very effective, stable, and outperforms other block-diagonal co-clustering algorithms devoted to the same task. Another important advantage of using modularity in the co-clustering context is that it provides a novel, simple way of determining the appropriate number of co-clusters. © 2016 Elsevier B.V. All rights reserved.
Pages: 160-173
Page count: 14
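
Illustrative sketch (not part of the record)
The abstract describes maximizing modularity over a document-term matrix by alternating between row and column assignments. The Python sketch below shows one way such a procedure could look. It assumes the commonly used bipartite form of modularity, Q = (1/N) * sum_ij (a_ij - a_i. * a_.j / N) * [row i and column j share a co-cluster], where N is the sum of all entries. The function names, parameters, and stopping rule are hypothetical choices made for illustration; this is not the authors' reference implementation of CoClus.

```python
import numpy as np


def modularity(A, row_labels, col_labels, n_clusters):
    """Bipartite modularity of a block-diagonal co-clustering (assumed form)."""
    A = np.asarray(A, dtype=float)
    N = A.sum()
    # B holds observed counts minus the counts expected under independence.
    B = A - np.outer(A.sum(axis=1), A.sum(axis=0)) / N
    Z = np.eye(n_clusters)[np.asarray(row_labels, dtype=int)]  # one-hot rows
    W = np.eye(n_clusters)[np.asarray(col_labels, dtype=int)]  # one-hot columns
    return float(np.trace(Z.T @ B @ W) / N)


def coclus_sketch(A, n_clusters, n_iter=20, seed=0, init_col_labels=None):
    """Alternately reassign rows and columns to increase modularity.

    init_col_labels is a hypothetical convenience parameter; when omitted,
    column labels are drawn at random, so results depend on the seed.
    """
    A = np.asarray(A, dtype=float)
    N = A.sum()
    B = A - np.outer(A.sum(axis=1), A.sum(axis=0)) / N
    if init_col_labels is None:
        rng = np.random.default_rng(seed)
        col_labels = rng.integers(n_clusters, size=A.shape[1])
    else:
        col_labels = np.asarray(init_col_labels, dtype=int)
    row_labels = np.zeros(A.shape[0], dtype=int)
    for _ in range(n_iter):
        # Columns fixed: each row picks the cluster with the largest gain.
        W = np.eye(n_clusters)[col_labels]
        row_labels = np.argmax(B @ W, axis=1)
        # Rows fixed: each column picks the cluster with the largest gain.
        Z = np.eye(n_clusters)[row_labels]
        new_col_labels = np.argmax(B.T @ Z, axis=1)
        if np.array_equal(new_col_labels, col_labels):
            break  # assignments stopped changing
        col_labels = new_col_labels
    q = modularity(A, row_labels, col_labels, n_clusters)
    return row_labels, col_labels, q


# Toy document-term matrix with two obvious diagonal blocks.
A = np.array([[5, 4, 0, 0],
              [4, 5, 0, 0],
              [0, 0, 3, 6],
              [0, 0, 6, 3]])
rows, cols, q = coclus_sketch(A, n_clusters=2, init_col_labels=[0, 1, 1, 1])
print(rows, cols, q)
```

A random start can land in a poor or degenerate solution, so in practice one would likely run several seeds and keep the highest modularity. The abstract's remark about choosing the number of co-clusters suggests, under the same assumptions, repeating the run for several values of n_clusters and comparing the resulting modularity values.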