KmL: A package to cluster longitudinal data

Cited by: 144

Authors:

Genolini, Christophe [1,2,3,4]

Falissard, Bruno [1,2,3,5]

Affiliations:

[1] INSERM, U669, F-75014 Paris, France

[2] Univ Paris 11, Paris, France

[3] Univ Paris 05, UMR S0669, Paris, France

[4] Univ Paris 10, UMR S0669, Paris, France

[5] Hop Bicetre, AP HP, Psychiat Serv, Le Kremlin Bicetre, France

Source:

COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE | 2011 / Vol. 104 / Issue 03

Keywords:

Package presentation; Longitudinal data; k-Means; Cluster analysis; Non-parametric algorithm; ALGORITHMS; NUMBER;

DOI:

10.1016/j.cmpb.2011.05.008

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

Cohort studies are becoming essential tools in epidemiological research. In these studies, measurements are not restricted to single variables but can be seen as trajectories. Thus, an important question concerns the existence of homogeneous patient trajectories. KmL is an R package providing an implementation of k-means designed to work specifically on longitudinal data. It provides several different techniques for dealing with missing values in trajectories (classical ones like linear interpolation or LOCF but also new ones like copyMean). It can run k-means with distances specifically designed for longitudinal data (like Frechet distance or any user-defined distance). Its graphical interface helps the user to choose the appropriate number of clusters when classic criteria are not efficient. It also provides an easy way to export graphical representations of the mean trajectories resulting from the clustering. Finally, it runs the algorithm several times, using various kinds of starting conditions and/or numbers of clusters to be sought, thus sparing the user a lot of manual re-sampling. (C) 2011 Elsevier Ireland Ltd. All rights reserved.
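
The ideas named in the abstract (LOCF imputation, k-means on whole trajectories, and repeated runs from different starting conditions) can be illustrated with a minimal sketch. This is not the KmL package or its R API: it is written in plain Python, and the function names `locf`, `kmeans`, and `best_of_restarts` are hypothetical. It assumes missing values are encoded as `None`, uses squared Euclidean distance rather than the Fréchet distance, and back-fills leading gaps with the first observed value, which is one possible convention among several.

```python
import random

def locf(traj):
    """Carry the last observation forward over None gaps.
    Assumption: leading gaps are back-filled with the first observed value."""
    observed = [v for v in traj if v is not None]
    if not observed:
        raise ValueError("trajectory has no observed values")
    filled, last = [], observed[0]
    for v in traj:
        if v is not None:
            last = v
        filled.append(last)
    return filled

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length trajectories."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_traj(trajs):
    """Pointwise mean trajectory of a non-empty group."""
    return [sum(col) / len(col) for col in zip(*trajs)]

def kmeans(trajs, k, rng, max_iter=100):
    """One k-means run started from k randomly chosen trajectories."""
    centers = rng.sample(trajs, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(t, centers[c]))
                      for t in trajs]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [t for t, l in zip(trajs, labels) if l == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = mean_traj(members)
    cost = sum(sq_dist(t, centers[l]) for t, l in zip(trajs, labels))
    return labels, centers, cost

def best_of_restarts(trajs, k, n_restarts=20, seed=0):
    """Repeat k-means from different starts; keep the lowest-cost run."""
    rng = random.Random(seed)
    runs = (kmeans(trajs, k, rng) for _ in range(n_restarts))
    return min(runs, key=lambda run: run[2])

# Toy data: three low and three high trajectories with missing values.
raw = [
    [1.0, None, 1.2, 1.1],
    [0.9, 1.0, None, None],
    [None, 1.1, 1.0, 1.2],
    [5.0, 5.5, None, 6.5],
    [5.2, None, 6.0, 6.4],
    [4.9, 5.4, 6.1, None],
]
trajs = [locf(t) for t in raw]
labels, centers, cost = best_of_restarts(trajs, k=2)
```

Keeping the lowest-cost partition over several random starts is a common way to reduce k-means' sensitivity to initialization. Comparing runs across different values of `k` would need a separate quality criterion, which this sketch does not implement.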

Citation

Pages: E112-E121

Page count: 10
