WEIGHTING AND SELECTION OF VARIABLES FOR CLUSTER-ANALYSIS

Cited by: 104
Authors
GNANADESIKAN, R
KETTENRING, JR
TSAO, SL
Affiliations
[1] BELLCORE,MORRISTOWN,NJ 07960
[2] AT&T BELL LABS,HOLMDEL,NJ 07733
Keywords
CLUSTERING; VARIABLE SELECTION; FEATURE SELECTION; VARIABLE WEIGHTING; VARIABLE IMPORTANCE; PATTERN RECOGNITION; DISCRIMINANT ANALYSIS
DOI
10.1007/BF01202271
Chinese Library Classification (CLC) number
O1 [Mathematics]
Subject classification code
0701; 070101
Abstract
One of the thorniest aspects of cluster analysis continues to be the weighting and selection of variables. This paper reports on the performance of nine methods on eight "leading case" simulated and real sets of data. The results demonstrate shortcomings of weighting based on the standard deviation or range as well as other more complex schemes in the literature. Weighting schemes based upon carefully chosen estimates of within-cluster and between-cluster variability are generally more effective. These estimates do not require knowledge of the cluster structure. Additional research is essential: worry-free approaches do not yet exist.
Pages: 113-136
Page count: 24
Related papers
32 in total
[1]  
Andrews D. F., 1985, DATA COLLECTION PROB
[2]  
ART D, 1982, UTILITAS MATHEMATICA, V21, P75
[3]  
Batchelor BG, 1978, PATTERN RECOGNITION
[4]   SYNTHESIZED CLUSTERING - A METHOD FOR AMALGAMATING ALTERNATIVE CLUSTERING BASES WITH DIFFERENTIAL WEIGHTING OF VARIABLES [J].
DESARBO, WS ;
CARROLL, JD ;
CLARK, LA ;
GREEN, PE .
PSYCHOMETRIKA, 1984, 49 (01) :57-78
[5]  
DESOETE G, 1985, J CLASSIF, V2, P173
[6]   OPTIMAL VARIABLE WEIGHTING FOR ULTRAMETRIC AND ADDITIVE TREE CLUSTERING [J].
DESOETE, G .
QUALITY & QUANTITY, 1986, 20 (2-3) :169-180
[8]   A PERMUTATION-BASED ALGORITHM FOR BLOCK CLUSTERING [J].
DUFFY, DE ;
QUIROZ, AJ .
JOURNAL OF CLASSIFICATION, 1991, 8 (01) :65-91
[9]   MULTIVARIATE ANALYSIS AND AGRICULTURAL EXPERIMENTS [J].
FINNEY, DJ .
BIOMETRICS, 1956, 12 (01) :67-71
[10]   VARIABLE SELECTION IN CLUSTERING [J].
FOWLKES, EB ;
GNANADESIKAN, R ;
KETTENRING, JR .
JOURNAL OF CLASSIFICATION, 1988, 5 (02) :205-228