KmL: A package to cluster longitudinal data

Cited by: 144
Authors
Genolini, Christophe [1, 2, 3, 4]
Falissard, Bruno [1, 2, 3, 5]
Affiliations
[1] INSERM, U669, F-75014 Paris, France
[2] Univ Paris 11, Paris, France
[3] Univ Paris 05, UMR S0669, Paris, France
[4] Univ Paris 10, UMR S0669, Paris, France
[5] Hop Bicetre, AP HP, Psychiat Serv, Le Kremlin Bicetre, France
Keywords
Package presentation; Longitudinal data; k-Means; Cluster analysis; Non-parametric algorithm; ALGORITHMS; NUMBER
DOI
10.1016/j.cmpb.2011.05.008
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Cohort studies are becoming essential tools in epidemiological research. In these studies, measurements are not restricted to single variables but can be seen as trajectories. Thus, an important question concerns the existence of homogeneous patient trajectories. KmL is an R package providing an implementation of k-means designed to work specifically on longitudinal data. It provides several different techniques for dealing with missing values in trajectories (classical ones like linear interpolation or LOCF but also new ones like copyMean). It can run k-means with distances specifically designed for longitudinal data (like Fréchet distance or any user-defined distance). Its graphical interface helps the user to choose the appropriate number of clusters when classic criteria are not efficient. It also provides an easy way to export graphical representations of the mean trajectories resulting from the clustering. Finally, it runs the algorithm several times, using various kinds of starting conditions and/or numbers of clusters to be sought, thus sparing the user a lot of manual re-sampling. © 2011 Elsevier Ireland Ltd. All rights reserved.
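The workflow the abstract describes (impute missing values, run k-means on whole trajectories, repeat from several starting conditions and keep the best run) can be illustrated with a minimal sketch. This is not the KmL package's code or API: it is written in plain Python rather than R, uses ordinary Euclidean distance instead of a longitudinal-specific distance, and every function name here (locf, kmeans_once, cluster_trajectories) is hypothetical. Filling a leading gap with the first observed value is also an assumption of this sketch.

```python
import math
import random


def locf(traj):
    """Last observation carried forward; None marks a missing value.

    Assumption of this sketch: a leading gap is filled with the first
    observed value, since there is nothing earlier to carry forward.
    """
    observed = [v for v in traj if v is not None]
    if not observed:
        raise ValueError("trajectory has no observed values")
    filled, last = [], observed[0]
    for v in traj:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def euclidean(a, b):
    """Euclidean distance between two trajectories of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans_once(trajs, k, rng, max_iter=100):
    """One k-means run started from k randomly chosen trajectories."""
    centers = [list(t) for t in rng.sample(trajs, k)]
    labels = None
    for _ in range(max_iter):
        # Assign each trajectory to its nearest mean trajectory.
        new_labels = [
            min(range(k), key=lambda c: euclidean(t, centers[c]))
            for t in trajs
        ]
        if new_labels == labels:
            break  # assignments are stable, so the run has converged
        labels = new_labels
        # Recompute each mean trajectory time point by time point.
        for c in range(k):
            members = [t for t, lab in zip(trajs, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    cost = sum(euclidean(t, centers[lab]) ** 2 for t, lab in zip(trajs, labels))
    return labels, centers, cost


def cluster_trajectories(raw, k, n_restarts=20, seed=0):
    """Impute with LOCF, then keep the lowest-cost run over several restarts."""
    rng = random.Random(seed)
    trajs = [locf(t) for t in raw]
    runs = (kmeans_once(trajs, k, rng) for _ in range(n_restarts))
    return min(runs, key=lambda run: run[2])


if __name__ == "__main__":
    # Toy data: three low and three high trajectories, each with one gap.
    raw = [
        [1.0, None, 1.2, 1.1],
        [0.9, 1.0, None, 1.0],
        [1.1, 1.0, 1.0, None],
        [5.0, 5.2, None, 5.1],
        [None, 5.0, 5.1, 4.9],
        [5.1, None, 5.0, 5.0],
    ]
    labels, centers, cost = cluster_trajectories(raw, k=2)
    print(labels)
```

Keeping the run with the smallest within-cluster sum of squares is one simple way to compare restarts. Choosing the number of clusters would require a separate quality criterion, which this sketch does not attempt.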
Pages: E112-E121
Page count: 10