A Comparison of Heuristic Procedures for Minimum Within-Cluster Sums of Squares Partitioning

Cited by: 1
Authors
Michael J. Brusco
Douglas Steinley
Affiliation
[1] Florida State University, Department of Marketing, College of Business
Source
Psychometrika | 2007 / Vol. 72
Keywords
combinatorial data analysis; cluster analysis; heuristics; sum of squares criterion
DOI
Not available
Abstract
Perhaps the most common criterion for partitioning a data set is the minimization of the within-cluster sums of squared deviation from cluster centroids. Although optimal solution procedures for within-cluster sums of squares (WCSS) partitioning are computationally feasible for small data sets, heuristic procedures are required for most practical applications in the behavioral sciences. We compared the performances of nine prominent heuristic procedures for WCSS partitioning across 324 simulated data sets representative of a broad spectrum of test conditions. Performance comparisons focused on both percentage deviation from the “best-found” WCSS values, as well as recovery of true cluster structure. A real-coded genetic algorithm and variable neighborhood search heuristic were the most effective methods; however, a straightforward two-stage heuristic algorithm, HK-means, also yielded exceptional performance. A follow-up experiment using 13 empirical data sets from the clustering literature generally supported the results of the experiment using simulated data. Our findings have important implications for behavioral science researchers, whose theoretical conclusions could be adversely affected by poor algorithmic performances.
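The two quantities the abstract relies on, the WCSS criterion and the percentage deviation from a "best-found" value, can be illustrated with a minimal sketch. This is not the authors' code: the function names `wcss` and `pct_deviation`, the toy data, and the use of NumPy are assumptions made only for illustration.

```python
import numpy as np

def wcss(X, labels):
    """Within-cluster sum of squared deviations from cluster centroids."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for k in np.unique(labels):
        members = X[labels == k]            # rows assigned to cluster k
        centroid = members.mean(axis=0)     # cluster centroid
        total += ((members - centroid) ** 2).sum()
    return total

def pct_deviation(value, best_found):
    """Percentage by which a WCSS value exceeds the best-found WCSS (hypothetical helper)."""
    return 100.0 * (value - best_found) / best_found

# Toy data (assumed): two well-separated pairs of points.
X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
good = wcss(X, [0, 0, 1, 1])   # partition matching the visible structure
poor = wcss(X, [0, 1, 0, 1])   # partition that splits each pair
print(good, poor, pct_deviation(poor, good))
```

A heuristic that returns the second partition would score far above the first on WCSS, which is the kind of gap the percentage-deviation measure is meant to expose.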
Pages: 583-600
Page count: 17