Sparse and smooth functional data clustering

Cited by: 11
Authors
Centofanti, Fabio [1]
Lepore, Antonio [1]
Palumbo, Biagio [1]
Affiliations
[1] Univ Naples Federico II, Dept Ind Engn, Piazzale Tecchio 80, I-80125 Naples, Italy
Keywords
Functional data analysis; Functional clustering; Model-based clustering; Penalized likelihood; Sparse clustering; Variable selection; Likelihood
DOI
10.1007/s00362-023-01408-1
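Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract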
A new model-based procedure is developed for sparse clustering of functional data that aims to classify a sample of curves into homogeneous groups while jointly detecting the most informative portions of the domain. The proposed method is referred to as sparse and smooth functional clustering (SaS-Funclust) and relies on a general functional Gaussian mixture model whose parameters are estimated by maximizing a log-likelihood function penalized with a functional adaptive pairwise fusion penalty and a roughness penalty. The former allows identifying the noninformative portion of the domain by shrinking the means of separated clusters to some common values, whereas the latter improves the interpretability by imposing some degree of smoothing to the estimated cluster means. The model is estimated via an expectation-conditional maximization algorithm paired with a cross-validation procedure. Through a Monte Carlo simulation study, the SaS-Funclust method is shown to outperform other methods that already appeared in the literature, both in terms of clustering performance and interpretability. Finally, three real-data examples are presented to demonstrate the favourable performance of the proposed method. The SaS-Funclust method is implemented in the R package sasfunclust, available on CRAN.
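The two penalties described in the abstract can be illustrated on curves sampled on a grid. The following is a minimal sketch, not the sasfunclust implementation: the function names, the trapezoidal and finite-difference discretization, the equally spaced grid, and the use of unit weights in place of the adaptive weights are all assumptions made for illustration. It shows how a pairwise fusion term vanishes wherever two cluster means coincide, how a roughness term grows with the curvature of the means, and how the remaining (non-fused) grid points can be read as the informative portion of the domain.

```python
from itertools import combinations


def trapezoid(values, grid):
    """Trapezoidal rule for values sampled on a (possibly uneven) grid."""
    return sum(
        0.5 * (values[j] + values[j + 1]) * (grid[j + 1] - grid[j])
        for j in range(len(grid) - 1)
    )


def fusion_penalty(means, grid, weights=None):
    """Approximate the sum over cluster pairs of the integral of w(t)|mu_g(t) - mu_h(t)|.

    `weights` maps a pair (g, h) to a weight curve on the grid. Unit weights
    are assumed when it is omitted; the adaptive weights are not reproduced.
    """
    total = 0.0
    for g, h in combinations(range(len(means)), 2):
        w = weights[(g, h)] if weights else [1.0] * len(grid)
        gap = [w[j] * abs(means[g][j] - means[h][j]) for j in range(len(grid))]
        total += trapezoid(gap, grid)
    return total


def roughness_penalty(means, grid):
    """Approximate the sum over clusters of the integrated squared second derivative.

    Uses central second differences, so the grid must be equally spaced.
    """
    step = grid[1] - grid[0]
    total = 0.0
    for mu in means:
        curvature_sq = [
            ((mu[j + 1] - 2.0 * mu[j] + mu[j - 1]) / step ** 2) ** 2
            for j in range(1, len(mu) - 1)
        ]
        total += trapezoid(curvature_sq, grid[1:-1])
    return total


def informative_points(means, grid, tol=1e-8):
    """Grid points where at least one pair of cluster means differs by more than tol."""
    return [
        t
        for j, t in enumerate(grid)
        if any(abs(a[j] - b[j]) > tol for a, b in combinations(means, 2))
    ]


def penalized_objective(log_likelihood, means, grid, lam_fusion, lam_rough):
    """Log-likelihood minus the two weighted penalties (a quantity to maximize)."""
    return (
        log_likelihood
        - lam_fusion * fusion_penalty(means, grid)
        - lam_rough * roughness_penalty(means, grid)
    )


# Toy example: two cluster means that are fused on [0, 0.5] and differ afterwards.
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
mu_a = [0.0, 0.0, 0.0, 0.0, 0.0]
mu_b = [0.0, 0.0, 0.0, 1.0, 1.0]

print(fusion_penalty([mu_a, mu_b], grid))      # 0.375
print(informative_points([mu_a, mu_b], grid))  # [0.75, 1.0]
```

In this toy example the fused region contributes nothing to the fusion term, so only the grid points 0.75 and 1.0 are flagged as informative. In practice the log-likelihood value would come from the fitted Gaussian mixture, and the two tuning parameters (here the hypothetical `lam_fusion` and `lam_rough`) would be chosen by a procedure such as the cross-validation mentioned in the abstract.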
Pages: 795-825
Number of pages: 31