Gaussian Clusters and Noise: An Approach Based on the Minimum Description Length Principle

Cited by: 0
Authors
Luosto, Panu [1]
Kivinen, Jyrki [1]
Mannila, Heikki [2]
Affiliations
[1] Univ Helsinki, Dept Comp Sci, FIN-00014 Helsinki, Finland
[2] Aalto Univ, Dept Informat & Comp Sci, Helsinki, Finland
Source
DISCOVERY SCIENCE, DS 2010 | 2010 / Vol. 6332
Keywords
STOCHASTIC COMPLEXITY; INFORMATION;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We introduce a well-grounded minimum description length (MDL) based quality measure for a clustering consisting of either spherical or axis-aligned normally distributed clusters and a cluster with a uniform distribution in an axis-aligned rectangular box. The uniform component extends the practical usability of the model, e.g., in the presence of noise, and using the MDL principle for model selection makes it possible to compare the quality of clusterings with different numbers of clusters. We also introduce a novel search heuristic for finding the best clustering with an unknown number of clusters. The heuristic is based on the idea of moving points from the Gaussian clusters to the uniform one and using MDL to determine the optimal amount of noise. Tests with synthetic data having a clear cluster structure suggest that the search method is effective in finding the intuitively correct clustering.
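The idea in the abstract of scoring a clustering by its code length can be illustrated with a minimal sketch. This is not the quality measure of the paper: it assumes a simplified BIC-style two-part code with spherical Gaussian clusters, a uniform component over the bounding box of all data, and code lengths measured in nats. The names `NOISE` and `description_length` are hypothetical and chosen only for this illustration.

```python
import math
from collections import defaultdict

NOISE = -1  # hypothetical label marking the uniform "noise" component


def description_length(points, labels):
    """Approximate two-part code length (in nats) of a labelled clustering.

    Assumption: a BIC-style approximation, not the measure from the paper.
    """
    n, d = len(points), len(points[0])
    groups = defaultdict(list)
    for p, lab in zip(points, labels):
        groups[lab].append(p)

    total = 0.0
    for lab, members in groups.items():
        m = len(members)
        # Cost of encoding which component each of the m points belongs to.
        total += -m * math.log(m / n)
        if lab == NOISE:
            # Uniform density over the axis-aligned bounding box of all data
            # (assumed box; the paper may define the box differently).
            log_volume = 0.0
            for i in range(d):
                side = max(p[i] for p in points) - min(p[i] for p in points)
                log_volume += math.log(max(side, 1e-9))
            total += m * log_volume
            n_params = 2 * d  # two box corners
        else:
            # Spherical Gaussian fitted by maximum likelihood.
            mean = [sum(p[i] for p in members) / m for i in range(d)]
            sq = sum((p[i] - mean[i]) ** 2 for p in members for i in range(d))
            var = max(sq / (m * d), 1e-9)  # floor avoids log(0)
            total += 0.5 * m * d * (math.log(2 * math.pi * var) + 1.0)
            n_params = d + 1  # mean vector plus one variance
        # Cost of encoding the component's parameters.
        total += 0.5 * n_params * math.log(n)
    return total


# Toy data: two tight clusters and two stray points between them.
grid = [(dx, dy) for dx in (-0.1, 0.0, 0.1) for dy in (-0.1, 0.0, 0.1)]
cluster_a = grid
cluster_b = [(10 + dx, 10 + dy) for dx, dy in grid]
outliers = [(4.0, 5.0), (6.0, 5.0)]
points = cluster_a + cluster_b + outliers

with_noise = description_length(points, [0] * 9 + [1] * 9 + [NOISE] * 2)
without_noise = description_length(points, [0] * 9 + [1] * 9 + [0, 1])
print(with_noise < without_noise)
```

On this toy data, labelling the two stray points as noise gives the shorter code, because forcing them into the Gaussian clusters inflates both fitted variances. A search heuristic in the spirit of the abstract would move points to the noise component only while such a move keeps shortening the total code length.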
Pages: 251-265
Page count: 15
Related papers
38 items in total
  • [21] On the Minimum Mean p-th Error in Gaussian Noise Channels and Its Applications
    Dytso, Alex
    Bustin, Ronit
    Tuninetti, Daniela
    Devroye, Natasha
    Poor, H. Vincent
    Shamai, Shlomo
    2016 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY, 2016, : 1646 - 1650
  • [22] Recognition memory models and binary-response ROCs: A comparison by minimum description length
    Kellen, David
    Klauer, Karl Christoph
    Broeder, Arndt
    PSYCHONOMIC BULLETIN & REVIEW, 2013, 20 (04) : 693 - 719
  • [23] A Deterministic Chaos-Model-Based Gaussian Noise Generator
    Haliuk, Serhii
    Vovchuk, Dmytro
    Spinazzola, Elisabetta
    Secco, Jacopo
    Bobrovs, Vjaceslavs
    Corinto, Fernando
    ELECTRONICS, 2024, 13 (07)
  • [24] Multivariate bounded support asymmetric generalized Gaussian mixture model with model selection using minimum message length
    Azam, Muhammad
    Bouguila, Nizar
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 204
  • [25] A supermartingale approach to Gaussian process based sequential design of experiments
    Bect, Julien
    Bachoc, Francois
    Ginsbourger, David
    BERNOULLI, 2019, 25 (4A) : 2883 - 2919
  • [26] The Conservatism Principle and the Asymmetric Timeliness of Earnings: An Event-Based Approach
    Shroff, Pervin K.
    Venkataraman, Ramgopal
    Zhang, Suning
    CONTEMPORARY ACCOUNTING RESEARCH, 2013, 30 (01) : 215 - 241
  • [27] A Fuzzy Clustering Approach for Complex Color Image Segmentation Based on Gaussian Model with Interactions between Color Planes and Mixture Gaussian Model
    Zhao, Xuemei
    Li, Yu
    Zhao, Quanhua
    INTERNATIONAL JOURNAL OF FUZZY SYSTEMS, 2018, 20 (01) : 309 - 317
  • [28] Constructing a meta-model for assembly tolerance types with a description logic based approach
    Zhong, Yanru
    Qin, Yuchu
    Huang, Meifa
    Lu, Wenlong
    Chang, Liang
    COMPUTER-AIDED DESIGN, 2014, 48 : 1 - 16
  • [29] GFO: A data driven approach for optimizing the Gaussian function based similarity metric in computational biology
    Lei, Jian-Bo
    Yin, Jiang-Bo
    Shen, Hong-Bin
    NEUROCOMPUTING, 2013, 99 : 307 - 315
  • [30] A Gaussian mixture model based virtual sample generation approach for small datasets in industrial processes
    Li, Ling
    Damarla, Seshu Kumar
    Wang, Yalin
    Huang, Biao
    INFORMATION SCIENCES, 2021, 581 : 262 - 277