[1] Univ Helsinki, Dept Comp Sci, FIN-00014 Helsinki, Finland
[2] Aalto Univ, Dept Informat & Comp Sci, Helsinki, Finland
Source:
DISCOVERY SCIENCE, DS 2010 | 2010 | Vol. 6332
Keywords:
STOCHASTIC COMPLEXITY;
INFORMATION;
DOI:
Not available
Chinese Library Classification (CLC) number:
TP18 [Artificial intelligence theory];
Subject classification codes:
081104 ;
0812 ;
0835 ;
1405 ;
Abstract:
We introduce a well-grounded minimum description length (MDL) based quality measure for a clustering consisting of either spherical or axis-aligned normally distributed clusters and a cluster with a uniform distribution in an axis-aligned rectangular box. The uniform component extends the practical usability of the model, e.g., in the presence of noise, and using the MDL principle for model selection makes it possible to compare the quality of clusterings with different numbers of clusters. We also introduce a novel search heuristic for finding the best clustering with an unknown number of clusters. The heuristic is based on the idea of moving points from the Gaussian clusters to the uniform one and using MDL to determine the optimal amount of noise. Tests with synthetic data having a clear cluster structure indicate that the search method is effective in finding the intuitively correct clustering.
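The idea in the abstract can be illustrated with a rough sketch. The code below is not the quality measure or the search heuristic of the paper: it assumes a crude two-part code length (negative log-likelihood plus a BIC-style parameter cost), spherical Gaussians only, a uniform component over the bounding box of all data, and a simple "farthest points first" ordering. The names `code_length`, `best_noise_amount`, and `NOISE` are hypothetical.

```python
import math
import random

NOISE = -1  # hypothetical label for the uniform (noise) component


def code_length(points, labels):
    """Crude two-part code length in nats: negative log-likelihood plus a
    BIC-style (k/2) log n parameter cost. An illustrative stand-in only."""
    n, d = len(points), len(points[0])
    # Assumption: the uniform box is the bounding box of the whole data set.
    volume = 1.0
    for a in range(d):
        column = [p[a] for p in points]
        volume *= max(max(column) - min(column), 1e-12)
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    total = 0.0
    n_params = len(groups) - 1  # free mixing weights
    for lab, members in groups.items():
        m = len(members)
        total -= m * math.log(m / n)  # cost of encoding the assignments
        if lab == NOISE:
            total += m * math.log(volume)
            n_params += 2 * d  # lower and upper corner of the box
        else:
            mean = [sum(p[a] for p in members) / m for a in range(d)]
            sq = sum((p[a] - mean[a]) ** 2 for p in members for a in range(d))
            var = max(sq / (m * d), 1e-12)  # ML variance, floored to avoid log(0)
            total += 0.5 * m * d * (math.log(2 * math.pi * var) + 1.0)
            n_params += d + 1  # mean vector and one shared variance
    return total + 0.5 * n_params * math.log(n)


def best_noise_amount(points, labels, max_moves):
    """Move the points farthest from their initial cluster mean to NOISE one
    at a time and return the labeling with the shortest code length."""
    d = len(points[0])
    means = {}
    for lab in set(labels) - {NOISE}:
        members = [p for p, l in zip(points, labels) if l == lab]
        means[lab] = [sum(p[a] for p in members) / len(members) for a in range(d)]

    def dist2(i):
        return sum((points[i][a] - means[labels[i]][a]) ** 2 for a in range(d))

    order = sorted((i for i in range(len(points)) if labels[i] != NOISE),
                   key=dist2, reverse=True)
    current = list(labels)
    best, best_len = list(current), code_length(points, current)
    for i in order[:max_moves]:
        current[i] = NOISE
        length = code_length(points, current)
        if length < best_len:
            best, best_len = list(current), length
    return best, best_len


# Synthetic data: two tight clusters plus ten uniformly scattered points.
random.seed(0)
points = [(random.gauss(0, 0.1), random.gauss(0, 0.1)) for _ in range(50)]
points += [(random.gauss(10, 0.1), random.gauss(10, 0.1)) for _ in range(50)]
points += [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(10)]
# Start with every point forced into the nearer of the two clusters.
labels = [0 if x + y < 10 else 1 for x, y in points]

start_len = code_length(points, labels)
best, best_len = best_noise_amount(points, labels, max_moves=20)
print(start_len, best_len, best.count(NOISE))
```

Because the densities are continuous, the absolute code lengths carry no meaning on their own (they can even be negative); only differences between labelings of the same data are comparable. In this sketch, the labeling that sends the scattered points to the uniform component should come out shorter, since those points otherwise inflate the Gaussian variances.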