Penalized model-based clustering of complex functional data

Cited by: 2

Authors:

Pronello, Nicola [1]

Ignaccolo, Rosaria [2]

Ippoliti, Luigi [3]

Fontanella, Sara [4]

Affiliations:

[1] Univ G dAnnunzio, Dept Neurosci Imaging & Clin Sci, Pescara, Italy

[2] Univ Turin, Dept Econ & Stat Cognetti de Martiis, Turin, Italy

[3] Univ G dAnnunzio, Dept Econ, Pescara, Italy

[4] Imperial Coll London, Natl Heart & Lung Inst, London, England

Source:

STATISTICS AND COMPUTING | 2023 / Vol. 33 / Issue 06

Keywords:

Functional zoning; Manifold data; Mixture models; Shape analysis; Spatial clustering; Surface data; CLASSIFICATION; REGRESSION; DIFFUSION; SPLINES; CURVES;

DOI:

10.1007/s11222-023-10288-2

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

High-dimensional data, large-scale data, imaging and manifold data are all opening new frontiers in statistics. These types of data are commonly considered in Functional Data Analysis, where they are viewed as infinite-dimensional random vectors in a functional space. The rapid development of new technologies has generated a flow of complex data, which in turn has led to the development of new modeling strategies. In this paper, we address the problem of clustering a set of complex functional data into homogeneous groups. Working in a mixture model-based framework, we develop a flexible clustering technique that achieves dimensionality reduction through an L1 penalization. The proposed procedure results in an integrated modeling approach in which shrinkage techniques are applied to enable sparse solutions in both the means and the covariance matrices of the mixture components, while preserving the underlying clustering structure. This leads to an entirely data-driven methodology suitable for simultaneous dimensionality reduction and clustering. The proposed methodology is evaluated through a Monte Carlo simulation study and an empirical analysis of real-world datasets showing different degrees of complexity.


Pages: 20

