Gaussian parsimonious clustering models with covariates and a noise component

Cited by: 28
Authors
Murphy, Keefe [1, 2]
Murphy, Thomas Brendan [1, 2]
Affiliations
[1] Univ Coll Dublin, Sch Math & Stat, Dublin, Ireland
[2] Univ Coll Dublin, Insight Ctr Data Analyt, Dublin, Ireland
Funding
Science Foundation Ireland
Keywords
Model-based clustering; Mixtures of experts; EM algorithm; Parsimony; Multivariate response; Covariates; Noise component; FINITE MIXTURES; R PACKAGE; CLASSIFICATION; REGRESSIONS; LIKELIHOOD; VARIABLES; EXPERTS;
DOI
10.1007/s11634-019-00373-8
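Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract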
We consider model-based clustering methods for continuous, correlated data that account for external information available in the presence of mixed-type fixed covariates by proposing the MoEClust suite of models. These models allow different subsets of covariates to influence the component weights and/or component densities by modelling the parameters of the mixture as functions of the covariates. A familiar range of constrained eigen-decomposition parameterisations of the component covariance matrices are also accommodated. This paper thus addresses the equivalent aims of including covariates in Gaussian parsimonious clustering models and incorporating parsimonious covariance structures into all special cases of the Gaussian mixture of experts framework. The MoEClust models demonstrate significant improvement from both perspectives in applications to both univariate and multivariate data sets. Novel extensions to include a uniform noise component for capturing outliers and to address initialisation of the EM algorithm, model selection, and the visualisation of results are also proposed.
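As a rough illustration of the model family the abstract describes, the conditional density of a Gaussian mixture of experts with a uniform noise component can be sketched as follows. The notation is assumed for this sketch and is not taken from the record: $G$ Gaussian components, gating covariates $x_i^{(G)}$, expert covariates $x_i^{(E)}$, regression coefficients $\beta_g$ and $\gamma_g$, and $V$ as the volume of the region on which the uniform noise density is defined. The noise weight $\tau_0$ is written here as constant, which is a simplifying assumption.

```latex
% Sketch only: symbols are illustrative assumptions, not the paper's own notation.
\begin{align*}
  f(y_i \mid x_i)
    &= \frac{\tau_0}{V}
     + \sum_{g=1}^{G} \tau_g\!\left(x_i^{(G)}\right)
       \phi\!\left(y_i \,;\, \mu_g\!\left(x_i^{(E)}\right), \Sigma_g\right),
    \qquad
    \tau_0 + \sum_{g=1}^{G} \tau_g\!\left(x_i^{(G)}\right) = 1, \\
  % Component weights depend on covariates through a multinomial logistic form.
  \tau_g\!\left(x_i^{(G)}\right)
    &\propto \exp\!\left(\beta_g^{\top} \tilde{x}_i^{(G)}\right), \\
  % Component means depend on covariates through a linear regression.
  \mu_g\!\left(x_i^{(E)}\right)
    &= \gamma_g^{\top} \tilde{x}_i^{(E)}, \\
  % Constrained eigen-decomposition of each component covariance matrix.
  \Sigma_g
    &= \lambda_g D_g A_g D_g^{\top}.
\end{align*}
```

Here $\tilde{x}$ denotes a covariate vector with a leading 1 for the intercept, and $\lambda_g$, $D_g$, $A_g$ control the volume, orientation, and shape of component $g$. Constraining some of these to be equal across components or fixing them to simple forms gives the parsimonious parameterisations; dropping the covariates from either the weights or the means recovers the corresponding special cases.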
Pages: 293-325
Page count: 33