Model-based co-clustering for functional data

Cited by: 30
Authors
Ben Slimen, Yosra [1, 2]
Allio, Sylvain [1]
Jacques, Julien [2]
Affiliations
[1] Orange Labs, Belfort, France
[2] Univ Lyon, Univ Lyon 2, ERIC EA3083, Lyon, France
Keywords
Co-clustering; Functional data; SEM-Gibbs algorithm; Latent block model; ICL-BIC criterion; Mobile network; Key performance indicators; APPROXIMATION; DENSITY
DOI
10.1016/j.neucom.2018.02.055
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
To give mobile network maintainers a simplified representation of key performance indicators that is easier to analyze, a model-based co-clustering algorithm for functional data is proposed. Co-clustering aims to identify block patterns in a data set by clustering its rows and columns simultaneously. The algorithm relies on the latent block model, in which each curve is identified by its functional principal components, which are modeled by a multivariate Gaussian distribution whose parameters are block-specific. These parameters are estimated by a stochastic EM algorithm embedding a Gibbs sampler. To select the numbers of row and column clusters, an ICL-BIC criterion is introduced. In addition to being the first co-clustering algorithm for functional data, the proposed model can extract the hidden double structure induced by the data and can deal with missing values. The model has proven its efficiency on simulated data and on a real data application that helps to optimize the topology of 4G mobile networks. © 2018 Elsevier B.V. All rights reserved.
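The estimation loop summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the functional principal component scores are already computed and stored in an array `X` of shape (rows, columns, scores), and it assumes a diagonal covariance per block, lightly smoothed cluster proportions, and no missing values. The function names (`block_loglik`, `sample_labels`, `m_step`, `sem_gibbs`) are hypothetical. Unlike a full SEM-Gibbs run, which would typically average parameters after a burn-in period, this sketch simply returns the last draw.

```python
import numpy as np

def block_loglik(X, mu, var):
    """Gaussian log-density of each curve's score vector under each block.

    X: (n, p, d) scores; mu, var: (K, L, d). Returns an (n, p, K, L) array.
    Assumption: diagonal covariance within each block.
    """
    diff = X[:, :, None, None, :] - mu            # broadcasts to (n, p, K, L, d)
    return -0.5 * np.sum(np.log(2 * np.pi * var) + diff**2 / var, axis=-1)

def sample_labels(logp, rng):
    """Draw one label per row of unnormalised log-probabilities (Gumbel-max trick)."""
    return np.argmax(logp + rng.gumbel(size=logp.shape), axis=1)

def m_step(X, z, w, K, L):
    """Update proportions and block parameters from the current partitions."""
    n, p, _ = X.shape
    Z, W = np.eye(K)[z], np.eye(L)[w]             # one-hot (n, K) and (p, L)
    # Assumption: add-one smoothing so that no proportion is exactly zero.
    alpha = (Z.sum(0) + 1.0) / (n + K)
    beta = (W.sum(0) + 1.0) / (p + L)
    counts = np.outer(Z.sum(0), W.sum(0))[:, :, None]   # cells per block, (K, L, 1)
    safe = np.maximum(counts, 1.0)
    mu = np.einsum('ik,jl,ijd->kld', Z, W, X) / safe
    var = np.einsum('ik,jl,ijd->kld', Z, W, X**2) / safe - mu**2
    # Floor the variance; give empty blocks a unit variance placeholder.
    var = np.where(counts > 0, np.maximum(var, 1e-6), 1.0)
    return alpha, beta, mu, var

def sem_gibbs(X, K, L, n_iter=100, seed=0):
    """Alternate parameter updates with Gibbs draws of row and column labels."""
    rng = np.random.default_rng(seed)
    n, p, _ = X.shape
    z, w = rng.integers(K, size=n), rng.integers(L, size=p)
    for _ in range(n_iter):
        alpha, beta, mu, var = m_step(X, z, w, K, L)
        ll = block_loglik(X, mu, var)
        # Draw row labels given column labels, then column labels given row labels.
        W = np.eye(L)[w]
        z = sample_labels(np.log(alpha) + np.einsum('ijkl,jl->ik', ll, W), rng)
        Z = np.eye(K)[z]
        w = sample_labels(np.log(beta) + np.einsum('ijkl,ik->jl', ll, Z), rng)
    return z, w, m_step(X, z, w, K, L)
```

Calling `sem_gibbs(X, K=2, L=3)` returns row labels, column labels, and the tuple `(alpha, beta, mu, var)`. Choosing `K` and `L` would then require a model selection criterion such as the ICL-BIC mentioned in the abstract, which this sketch does not compute.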
Pages: 97-108
Number of pages: 12