Variance Matrix Priors for Dirichlet Process Mixture Models With Gaussian Kernels

Cited by: 1
Authors
Jing, Wei [1]
Papathomas, Michail [1]
Liverani, Silvia [2,3]
Affiliations
[1] Univ St Andrews, Sch Math & Stat, St Andrews, Scotland
[2] Queen Mary Univ London, Sch Math Sci, London, England
[3] Alan Turing Inst, British Lib, London, England
Keywords
Bayesian non-parametrics; clustering; Bayesian variable selection; prior distributions; profile regression; number; lasso
DOI
10.1111/insr.12595
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject Classification Codes
020208; 070103; 0714
Abstract
Bayesian mixture modelling is widely used for density estimation and clustering. The Dirichlet process mixture model (DPMM) is the most popular Bayesian non-parametric mixture modelling approach. In this manuscript, we study the choice of prior for the variance or precision matrix when Gaussian kernels are adopted. In the relevant literature, mixture models are typically assessed on observations in spaces of only a handful of dimensions. Instead, we consider more realistic problems of higher dimensionality, in spaces of up to 20 dimensions. We observe that the choice of prior becomes increasingly important as the dimensionality of the problem grows. After identifying undesirable properties of standard priors in higher-dimensional problems, we review and implement possible alternative priors. We identify the most promising priors, along with other factors that affect the convergence of MCMC samplers. Our results show that the choice of prior is critical for deriving reliable posterior inferences. This manuscript offers a thorough overview and comparative investigation of possible priors, with detailed guidelines for their implementation. Although our work focuses on the use of the DPMM in clustering, it is also applicable to density estimation.
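
The record does not reproduce the model, but as a brief sketch of the setup the abstract refers to: a DPMM with Gaussian kernels places a Dirichlet process prior on the mixing distribution, and the variance-matrix prior under study enters through the base measure G_0. The standard conjugate choice, whose behaviour in higher dimensions is what motivates comparisons of this kind, is the Normal-inverse-Wishart:

$$y_i \mid \mu_i, \Sigma_i \sim \mathcal{N}(\mu_i, \Sigma_i), \qquad (\mu_i, \Sigma_i) \mid G \overset{\text{iid}}{\sim} G, \qquad G \sim \mathrm{DP}(\alpha, G_0),$$
$$G_0:\quad \Sigma \sim \mathcal{IW}(\nu_0, \Lambda_0), \qquad \mu \mid \Sigma \sim \mathcal{N}(\mu_0, \Sigma/\kappa_0).$$

The scale matrix \Lambda_0 and degrees of freedom \nu_0 are the hyperparameters whose specification the abstract flags as increasingly consequential as the dimension grows towards 20; priors placed on the precision matrix \Sigma^{-1} (e.g., Wishart) are the mirror-image parameterisation.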
Pages: 25
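
For readers who want to experiment, the following is a minimal, hedged sketch of where a variance-matrix prior enters a Dirichlet process Gaussian mixture in software. It uses scikit-learn's BayesianGaussianMixture, which fits a truncated stick-breaking DP mixture by variational inference rather than the MCMC samplers studied in the paper; the synthetic data, truncation level and prior values are illustrative assumptions, not settings taken from the paper.

import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)

# Hypothetical toy data: three well-separated Gaussian clusters in d = 20
# dimensions, echoing the dimensionality the abstract is concerned with.
d, n_per = 20, 100
means = [np.zeros(d), 5 * np.ones(d), -5 * np.ones(d)]
X = np.vstack([rng.normal(m, 1.0, size=(n_per, d)) for m in means])

# Truncated Dirichlet process mixture with full Gaussian kernels.
# The variance-matrix prior enters through `covariance_prior` (the scale
# matrix of the Wishart-type prior on the component variance structure)
# and `degrees_of_freedom_prior`; both values here are illustrative,
# not the priors compared in the paper.
dpmm = BayesianGaussianMixture(
    n_components=30,                                  # truncation level
    covariance_type="full",
    weight_concentration_prior_type="dirichlet_process",
    weight_concentration_prior=1.0,                   # DP concentration
    degrees_of_freedom_prior=d + 2,                   # must exceed d - 1
    covariance_prior=np.eye(d),                       # Wishart scale matrix
    max_iter=500,
    random_state=0,
)
labels = dpmm.fit_predict(X)
print("clusters found:", np.unique(labels).size)

The two arguments covariance_prior and degrees_of_freedom_prior are the software analogue of the hyperparameters the abstract identifies as critical: rerunning the fit with different scale matrices (e.g., 0.1 * np.eye(d) versus 10 * np.eye(d)) is a quick way to see how sensitive the inferred number of clusters is to the variance-matrix prior in 20 dimensions.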