Quantile-based clustering
Cited by: 5
Authors:
Hennig, Christian [1]
Viroli, Cinzia [1]
Anderlucci, Laura [1]
Affiliations:
[1] Univ Bologna, Dept Stat Sci, Via Belle Arti 41, I-40126 Bologna, Italy
Source:
Keywords:
Fixed partition model;
quantile discrepancy;
high dimensional clustering;
nonparametric mixture;
CLASSIFICATION;
CONSISTENCY;
DOI:
10.1214/19-EJS1640
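Chinese Library Classification (CLC) number:
O21 [Probability theory and mathematical statistics];
C8 [Statistics];
Discipline classification code:
020208;
070103;
0714;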
Abstract:
A new cluster analysis method, K-quantiles clustering, is introduced. K-quantiles clustering can be computed by a simple greedy algorithm in the style of the classical Lloyd's algorithm for K-means. It can be applied to large and high-dimensional datasets. It allows for within-cluster skewness and internal variable scaling based on within-cluster variation. Different versions allow for different levels of parsimony and computational efficiency. Although K-quantiles clustering is conceived as nonparametric, it can be connected to a fixed partition model of generalized asymmetric Laplace-distributions. The consistency of K-quantiles clustering is proved, and it is shown that K-quantiles clusters correspond to well separated mixture components in a nonparametric mixture. In a simulation, K-quantiles clustering is compared with a number of popular clustering methods with good results. A high-dimensional microarray dataset is clustered by K-quantiles.
Pages: 4849-4883
Number of pages: 35