Estimating the cluster tree of a density by analyzing the minimal spanning tree of a sample

被引:117
作者
Stuetzle, W [1 ]
机构
[1] Univ Washington, Dept Stat, Seattle, WA 98195 USA
关键词
two-way; two-mode data; nearest neighbor density estimation; single linkage clustering; runt test; mixture models;
D O I
10.1007/s00357-003-0004-6
中图分类号
O1 [数学];
学科分类号
0701 ; 070101 ;
摘要
We present runt pruning, a new clustering method that attempts to find modes of a density by analyzing the minimal spanning tree of a sample. The method exploits the connection between the minimal spanning tree and nearest neighbor density estimation. It does not rely on assumptions about the specific form of the data density (e.g., normal mixture) or about the geometric shapes of the clusters, and is computationally feasible for large data sets.
引用
收藏
页码:25 / 47
页数:23
相关论文
共 31 条
[1]  
Ankerst M., 1999, SIGMOD Record, V28, P49, DOI 10.1145/304181.304187
[2]  
Ballard D.H., 1982, Computer Vision
[3]  
BENTLEY JL, 1978, IEEE T COMPUT, V27, P97, DOI 10.1109/TC.1978.1675043
[4]  
CUTTING DR, 1992, SIGIR 92 : PROCEEDINGS OF THE FIFTEENTH ANNUAL INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, P318
[5]  
CUTTING DR, 1993, P 16 ANN INT ACM SIG, P126
[6]  
Ester M., 1996, Proc. Second Int. Conf. Knowl. Discov. Data Min, P226, DOI DOI 10.5555/3001460.3001507
[7]   How many clusters? Which clustering method? Answers via model-based cluster analysis [J].
Fraley, C ;
Raftery, AE .
COMPUTER JOURNAL, 1998, 41 (08) :578-588
[8]   MCLUST: Software for model-based cluster analysis [J].
Fraley, C ;
Raftery, AE .
JOURNAL OF CLASSIFICATION, 1999, 16 (02) :297-306
[9]   PROJECTION PURSUIT ALGORITHM FOR EXPLORATORY DATA-ANALYSIS [J].
FRIEDMAN, JH ;
TUKEY, JW .
IEEE TRANSACTIONS ON COMPUTERS, 1974, C 23 (09) :881-890
[10]   EXPLORATORY PROJECTION PURSUIT [J].
FRIEDMAN, JH .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1987, 82 (397) :249-266