Clustering Longitudinal Data: A Review of Methods and Software Packages

Cited by: 1
Author
Lu, Zihang [1, 2]
Affiliations
[1] Queens Univ, Dept Publ Hlth Sci, Kingston, ON, Canada
[2] Queens Univ, Dept Math & Stat, Kingston, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
cluster analysis; longitudinal data; model-based clustering; algorithm-based clustering; functional clustering; FUNCTIONAL DATA-ANALYSIS; LATENT CLASS ANALYSIS; MIXTURE-MODELS; K-MEANS; R PACKAGE; BAYESIAN-INFERENCE; CROSS-VALIDATION; UNKNOWN NUMBER; MIXED MODELS; MISSING DATA;
DOI
10.1111/insr.12588
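Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code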
020208 ; 070103 ; 0714 ;
Abstract
Clustering of longitudinal data is becoming increasingly popular in many fields such as social sciences, business, environmental science, medicine and healthcare. However, it is often challenging due to the complex nature of the data, such as dependencies between observations collected over time, missingness, sparsity and non-linearity, making it difficult to identify meaningful patterns and relationships among the data. Despite the increasingly common application of cluster analysis for longitudinal data, many existing methods are still less known to researchers, and limited guidance is provided in choosing between methods and software packages. In this paper, we review several commonly used methods for clustering longitudinal data. These methods are broadly classified into three categories, namely, model-based approaches, algorithm-based approaches and functional clustering approaches. We perform a comparison among these methods and their corresponding R software packages using real-life datasets and simulated datasets under various conditions. Findings from the analyses and recommendations for using these approaches in practice are discussed.
Pages: 34