Clustering by response: CBR

Cited by: 4
Authors
Hecker, H [1 ]
Wubbelt, P [1 ]
Affiliation
[1] HANNOVER MED SCH,INST BIOMETRIE 8410,D-30623 HANNOVER,GERMANY
Keywords
CART; conceptual clustering; RECPAM; tree analysis;
DOI
10.1016/S0167-9473(96)00061-8
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Subject classification code
081203 ; 0835 ;
Abstract
The analysis of relations between "independent" (predictor) and "dependent" (response) variables may be regarded as a classification problem if the aim of a study is to identify subgroups of subjects with markedly different distributions of the response variable. In this paper, a class of models is presented in which both aspects of this aim are recognized: first, with respect to the response, the homogeneity within and the diversity between the classes; and second, with respect to the predictor variables, the characterization of each class as an easily interpretable (connected) subset of the space of all multidimensional predictor values. The P-value of a suitable statistic for testing the equality of the distribution of the response variable between two or more subgroups is used as a measure of diversity between the classes. The models are restricted to binary predictors, but the response variable may be uni- or multivariate, qualitative, quantitative, or right-censored. The complexity of the model can be defined by the user, and for a specified model, estimation of the "true classification rule" and permutation tests on additional hypotheses are derived. A description of a computer program developed to perform all of the estimation and test procedures described is included, together with an illustrative application. Some connections to related topics such as tree analysis and conceptual clustering are drawn.
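The idea of using a P-value as a measure of diversity between classes can be illustrated with a minimal sketch. It is not the authors' program: it assumes a univariate quantitative response, uses the absolute difference in subgroup means as the test statistic, and obtains the P-value by random permutation of the predictor labels. The function name `diversity_p_value`, the toy data, and the predictor names are all hypothetical.

```python
import random
from statistics import mean


def diversity_p_value(response, split, n_perm=2000, seed=0):
    """Permutation P-value for equal response distributions in the two
    subgroups defined by a binary predictor (smaller = more diverse).

    Assumption: the statistic is the absolute difference in means;
    the paper only speaks of "a suitable statistic".
    """
    rng = random.Random(seed)

    def stat(labels):
        g1 = [y for y, s in zip(response, labels) if s == 1]
        g0 = [y for y, s in zip(response, labels) if s == 0]
        return abs(mean(g1) - mean(g0))

    observed = stat(split)
    labels = list(split)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any link between predictor and response
        if stat(labels) >= observed:
            hits += 1
    # Add-one correction keeps the estimated P-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): one informative and one uninformative binary predictor.
response = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
predictors = {
    "x_informative": [0, 0, 0, 0, 1, 1, 1, 1],
    "x_noise": [0, 1, 0, 1, 0, 1, 0, 1],
}

p_values = {name: diversity_p_value(response, x) for name, x in predictors.items()}
best = min(p_values, key=p_values.get)  # predictor giving the most diverse split
print(p_values, best)
```

Choosing the split with the smallest P-value mirrors, in a simplified way, the use of the P-value as a diversity measure; handling multivariate, qualitative, or right-censored responses would require a different statistic.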
Pages: 193-215
Number of pages: 23