UNIVARIATE SCREENING MEASURES FOR CLUSTER-ANALYSIS

Cited by: 10
Author
DONOGHUE, JR
Affiliation
[1] Educational Testing Service, Princeton, NJ 08541
DOI
10.1207/s15327906mbr3003_5
Chinese Library Classification number
O1 [Mathematics];
Discipline classification code
0701 ; 070101 ;
Abstract
Inclusion of irrelevant variables in a cluster analysis adversely affects subgroup recovery. This article examines using moment-based statistics to screen variables; only variables which pass the screening are then used in clustering. Normal mixtures are analytically shown often to possess negative kurtosis. Two related measures, m and coefficient of bimodality b, are also examined. A Monte Carlo study compared the screening measures to no selection, De Soete's (1988) ultrametric weights, and Fowlkes, Gnanadesikan, and Kettenring's (1988) forward selection procedure. Screening based on kurtosis degraded recovery and is not recommended. In contrast, screening on m or on b improved recovery over both no selection and forward selection, and screening performed as well as ultrametric weights. Combining screening with ultrametric weights performed extremely well. All methods were found to be somewhat sensitive to other types of error. Screening variables appears a viable alternative to both ultrametric weights and forward selection. The potential advantages and disadvantages of screening are considered.
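The screening idea in the abstract (compute a moment-based statistic per variable, keep only the variables that pass, then cluster) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the article's procedure: the formula is the widely used sample bimodality coefficient, computed with simple (uncorrected) moment estimators, and may differ from the article's b; the measure m is not reproduced because this record does not define it; the cutoff 5/9 (the large-sample value for a uniform distribution) and the function names are hypothetical choices.

```python
import numpy as np

def sample_moments(x):
    """Return (skewness, excess kurtosis) using simple moment estimators."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    m2 = np.mean(d ** 2)
    skew = np.mean(d ** 3) / m2 ** 1.5
    excess_kurt = np.mean(d ** 4) / m2 ** 2 - 3.0
    return skew, excess_kurt

def bimodality_coefficient(x):
    """Commonly used sample bimodality coefficient (assumed form, see lead-in)."""
    n = len(x)
    skew, kurt = sample_moments(x)
    correction = 3.0 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return (skew ** 2 + 1.0) / (kurt + correction)

def screen_variables(X, threshold=5.0 / 9.0):
    """Return indices of columns whose coefficient exceeds the (assumed) cutoff."""
    X = np.asarray(X, dtype=float)
    b = np.array([bimodality_coefficient(X[:, j]) for j in range(X.shape[1])])
    return np.flatnonzero(b > threshold), b

# Usage: column 0 is a two-component normal mixture, column 1 is pure noise.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=2000)
X = np.column_stack([
    rng.normal(loc=np.where(labels == 0, -3.0, 3.0), scale=1.0),
    rng.normal(size=2000),
])
selected, b_values = screen_variables(X)
```

Only the columns in `selected` would then be passed to the clustering step. A well-separated mixture has low kurtosis and therefore a high coefficient, while a single normal variable sits near 1/3, which is why a cutoff between the two can separate them in this toy case.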
Pages: 385-427
Page count: 43