Several recent studies have demonstrated the usefulness of co-clustering, a data mining technique that simultaneously produces row-clusters of observations and column-clusters of features. The present work introduces a novel co-clustering model to summarize textual data in document-term format. In addition to highlighting homogeneous co-clusters, as existing algorithms do, we also distinguish noisy co-clusters from significant ones, which is particularly useful for sparse document-term matrices. Furthermore, our model proposes a structure among the significant co-clusters, which improves interpretability for users. The proposed approach is competitive with state-of-the-art methods for document and term clustering and offers user-friendly results. The model relies on the Poisson distribution and on a constrained version of the Latent Block Model, a probabilistic approach to co-clustering. A Stochastic Expectation-Maximization algorithm is proposed for inference, along with a model selection criterion for choosing the number of co-clusters. Experiments on both simulated and real data sets illustrate the efficiency of the model through its ability to identify relevant co-clusters easily. (C) 2020 Elsevier Ltd. All rights reserved.
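To make the setting concrete, the following is a minimal sketch of how a document-term count matrix could be simulated from a generic Poisson latent block structure. It is not the constrained model of the paper: the function names (`simulate_poisson_lbm`, `block_means`), the block rate matrix `gamma`, and the choice of a last column-cluster whose rate is identical across row-clusters (standing in for a "noisy" co-cluster) are all assumptions made for illustration only.

```python
import numpy as np


def simulate_poisson_lbm(n_docs, n_terms, row_props, col_props, gamma, seed=0):
    """Draw counts from a plain Poisson latent block structure (illustrative)."""
    rng = np.random.default_rng(seed)
    row_props = np.asarray(row_props, dtype=float)
    col_props = np.asarray(col_props, dtype=float)
    gamma = np.asarray(gamma, dtype=float)

    # Latent row-cluster label for each document, column-cluster label for each term.
    z = rng.choice(len(row_props), size=n_docs, p=row_props)
    w = rng.choice(len(col_props), size=n_terms, p=col_props)

    # Each cell's Poisson rate depends only on the co-cluster (block) it falls in.
    rates = gamma[np.ix_(z, w)]
    counts = rng.poisson(rates)
    return counts, z, w


def block_means(counts, z, w, n_row_clusters, n_col_clusters):
    """Average count inside each co-cluster, given the row and column labels."""
    means = np.full((n_row_clusters, n_col_clusters), np.nan)
    for k in range(n_row_clusters):
        for l in range(n_col_clusters):
            block = counts[np.ix_(z == k, w == l)]
            if block.size:
                means[k, l] = block.mean()
    return means


# Hypothetical rates: two "significant" column-clusters and one column-cluster
# (last column) whose rate does not vary across row-clusters.
example_gamma = [[5.0, 0.2, 0.2],
                 [0.2, 5.0, 0.2]]
example_counts, example_z, example_w = simulate_poisson_lbm(
    200, 300, [0.5, 0.5], [0.4, 0.4, 0.2], example_gamma, seed=0
)
print(block_means(example_counts, example_z, example_w, 2, 3))
```

In this sketch the true labels are known, so the block averages simply recover the rates used for simulation. Estimating the labels and the number of co-clusters from the counts alone is the inference problem that the abstract addresses with a Stochastic Expectation-Maximization algorithm and a model selection criterion.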
Affiliations:
Univ Paris 05, Lab MAP5, UMR CNRS 8145, Paris, France
Univ Luxembourg, 162a Ave Faiencerie, L-1511 Luxembourg, Luxembourg
Berge, Laurent R.
Bouveyron, Charles
Affiliations:
Univ Cote dAzur, Lab JA Dieudonne, UMR CNRS 7351, Nice, France
INRIA Sophia Antipolis, Epione, Valbonne, France
Corneli, Marco
Affiliations:
Univ Cote dAzur, Lab JA Dieudonne, UMR CNRS 7351, Nice, France
Off 4S813, Lab JA Dieudonne, Campus Valrose, F-06108 Nice, France
Latouche, Pierre
Affiliations:
Univ Paris 05, Lab MAP5, UMR CNRS 8145, Paris, France
Univ Paris 1 Pantheon Sorbonne, EA 4543, Lab SAMM, Paris, France
Affiliations:
Lab ERIC, 5 Ave Pierre Mendes France, F-69500 Bron, France
Univ Lumiere Lyon 2, 86 Rue Pasteur, F-69007 Lyon, France
Selosse, Margot
Jacques, Julien
Affiliations:
Lab ERIC, 5 Ave Pierre Mendes France, F-69500 Bron, France
Univ Lumiere Lyon 2, 86 Rue Pasteur, F-69007 Lyon, France