Local optima in K-means clustering:: What you don't know may hurt you

被引:198
作者
Steinley, D [1 ]
机构
[1] Univ Illinois, Dept Psychol, Champaign, IL 61820 USA
关键词
D O I
10.1037/1082-989X.8.3.294
中图分类号
B84 [心理学];
学科分类号
04 ; 0402 ;
摘要
The popular K-means clustering method, as implemented in 3 commercial software packages (SPSS, SYSTAT, and SAS), generally provides solutions that are only locally optimal for a given set of data. Because none of these commercial implementations offer a reasonable mechanism to begin the K-means method at alternative starting points, separate routines were written within the MATLAB (MathWorks, 1999) environment that can be initialized randomly (these routines are provided at the end of the online version of this article in the PsycARTICLES database). Through the analysis of 2 empirical data sets and 8 10 simulated data sets, it is shown that the results provided by commercial packages are most likely locally optimal. These results suggest the need for some strategy to study the local optima problem for a specific data set or to identify methods for finding "good" starting values that might lead to the best solutions possible.
引用
收藏
页码:294 / 304
页数:11
相关论文
共 34 条
  • [1] BALAKRISHNAN PV, 1994, PSYCHOMETRIKA, V59, P509
  • [2] A variable-selection heuristic for K-means clustering
    Brusco, MJ
    Cradit, JD
    [J]. PSYCHOMETRIKA, 2001, 66 (02) : 249 - 270
  • [3] Cohen J., 1998, Statistical Power Analysis for the Behavioral Sciences, V2nd
  • [4] REVIEW OF CLASSIFICATION
    CORMACK, RM
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES A-GENERAL, 1971, 134 : 321 - +
  • [5] Model-based clustering, discriminant analysis, and density estimation
    Fraley, C
    Raftery, AE
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2002, 97 (458) : 611 - 631
  • [6] GOHM C, 1998, THESIS U ILLINOIS UR
  • [7] GRAY T, 1924, ODE DISTANT PROSPECT
  • [8] Hartigan J. A., 1979, Applied Statistics, V28, P100, DOI 10.2307/2346830
  • [9] Hartigan J. A., 1975, CLUSTERING ALGORITHM
  • [10] COMPARING PARTITIONS
    HUBERT, L
    ARABIE, P
    [J]. JOURNAL OF CLASSIFICATION, 1985, 2 (2-3) : 193 - 218