Fitting semiparametric clustering models to dissimilarity data

Cited by: 0
Author
Maurizio Vichi
Affiliation
[1] Department of Statistics, Probability and Applied Statistics, “Sapienza” University of Rome
Source
Advances in Data Analysis and Classification | 2008 / Vol. 2
Keywords
Cluster analysis; Partitions; Semiparametric clustering models; LS-estimation; 62H30
DOI
Not available
Abstract
The cluster analysis problem of partitioning a set of objects from dissimilarity data is handled here with the statistical model-based approach of fitting the “closest” classification matrix to the observed dissimilarities. A classification matrix represents a clustering structure expressed in terms of dissimilarities. In cluster analysis, few widely used methodologies partition a set of objects directly from dissimilarity data. In real applications, a hierarchical clustering algorithm is applied to the dissimilarities and a partition is subsequently chosen by visual inspection of the dendrogram. Alternatively, a “tandem analysis” is used: a Multidimensional Scaling (MDS) algorithm is applied first, and a partitioning algorithm such as k-means is then applied to the dimensions specified by the MDS. However, neither the hierarchical clustering algorithms nor the tandem analysis is specifically defined to solve the statistical problem of fitting the closest partition to the observed dissimilarities. This lack of appropriate methodologies motivates the paper, which introduces and studies three new object partitioning models for dissimilarity data, estimates them via least squares, and introduces three new fast algorithms.
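To make the idea of fitting the “closest” classification matrix concrete, the following is a minimal sketch under stated assumptions, not one of the three models of the paper (the abstract does not specify them). It assumes the simplest possible classification matrix, with one common within-cluster dissimilarity and one common between-cluster dissimilarity, and it replaces the paper's fast algorithms with an exhaustive search that is only feasible for a handful of objects. The function names `fit_two_level` and `best_partition` and the toy matrix `D` are hypothetical.

```python
from itertools import combinations, product


def fit_two_level(D, labels):
    """Least-squares fit of an assumed two-level classification matrix.

    For a fixed partition, the least-squares estimates of the single
    within-cluster and single between-cluster dissimilarity are the
    means of the observed dissimilarities in each group of pairs.
    Returns (within_value, between_value, loss).
    """
    within, between = [], []
    for i, j in combinations(range(len(labels)), 2):
        (within if labels[i] == labels[j] else between).append(D[i][j])
    w = sum(within) / len(within) if within else 0.0
    b = sum(between) / len(between) if between else 0.0
    loss = sum((d - w) ** 2 for d in within) + sum((d - b) ** 2 for d in between)
    return w, b, loss


def best_partition(D, k):
    """Exhaustive search over all assignments of objects to k non-empty clusters.

    Illustrative only: the number of assignments grows as k**n.
    """
    n = len(D)
    best = None
    for labels in product(range(k), repeat=n):
        if len(set(labels)) < k:  # skip assignments that leave a cluster empty
            continue
        w, b, loss = fit_two_level(D, labels)
        if best is None or loss < best[3]:
            best = (labels, w, b, loss)
    return best


# Toy dissimilarities: objects 0-2 and 3-5 form two well-separated groups.
groups = [0, 0, 0, 1, 1, 1]
D = [[0.0 if i == j else (1.0 if groups[i] == groups[j] else 5.0)
      for j in range(6)] for i in range(6)]

labels, w, b, loss = best_partition(D, k=2)
print(labels, w, b, loss)
```

Unlike the tandem analysis, this sketch searches the partition by minimizing the squared discrepancy from the observed dissimilarities directly, which is the kind of criterion the abstract describes.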