Variance Matrix Priors for Dirichlet Process Mixture Models With Gaussian Kernels

Cited by: 1
Authors
Jing, Wei [1]
Papathomas, Michail [1]
Liverani, Silvia [2,3]
Affiliations
[1] Univ St Andrews, Sch Math & Stat, St Andrews, Scotland
[2] Queen Mary Univ London, Sch Math Sci, London, England
[3] Alan Turing Inst, British Lib, London, England
Keywords
Bayesian non-parametrics; clustering; BAYESIAN VARIABLE SELECTION; PRIOR DISTRIBUTIONS; PROFILE REGRESSION; NUMBER; LASSO;
DOI
10.1111/insr.12595
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline Classification Codes
020208; 070103; 0714
Abstract
Bayesian mixture modelling is widely used for density estimation and clustering. The Dirichlet process mixture model (DPMM) is the most popular Bayesian non-parametric mixture modelling approach. In this manuscript, we study the choice of prior for the variance or precision matrix when Gaussian kernels are adopted. Typically, in the relevant literature, the assessment of mixture models is done by considering observations in a space of only a handful of dimensions. Instead, we are concerned with more realistic problems of higher dimensionality, in a space of up to 20 dimensions. We observe that the choice of prior is increasingly important as the dimensionality of the problem increases. After identifying certain undesirable properties of standard priors in problems of higher dimensionality, we review and implement possible alternative priors. The most promising priors are identified, as well as other factors that affect the convergence of MCMC samplers. Our results show that the choice of prior is critical for deriving reliable posterior inferences. This manuscript offers a thorough overview and comparative investigation into possible priors, with detailed guidelines for their implementation. Although our work focuses on the use of the DPMM in clustering, it is also applicable to density estimation.
Pages: 25
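
To make the variance-matrix prior concrete, the sketch below (not taken from the paper) fits a truncated Dirichlet process mixture of Gaussians using scikit-learn's BayesianGaussianMixture. This is a variational approximation rather than the MCMC samplers studied in the manuscript, and the hyperparameter values (d + 2 degrees of freedom, identity scale matrix) are illustrative assumptions, not the priors recommended by the authors; they simply show where the Wishart-type prior on the component precision matrices enters the model.

import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)

# Synthetic data in 20 dimensions, mirroring the dimensionality considered
# in the manuscript: three well-separated Gaussian clusters.
d = 20
centers = rng.normal(scale=4.0, size=(3, d))
X = np.vstack([c + rng.normal(size=(300, d)) for c in centers])

# Truncated DPMM with Gaussian kernels, fitted by variational inference.
# `degrees_of_freedom_prior` and `covariance_prior` parameterise the
# Wishart-type prior on the component precision matrices; changing them is
# the "variance matrix prior" choice discussed in the abstract.
dpmm = BayesianGaussianMixture(
    n_components=30,                        # truncation level of the DP
    weight_concentration_prior_type="dirichlet_process",
    weight_concentration_prior=1.0,         # DP concentration parameter
    covariance_type="full",
    degrees_of_freedom_prior=d + 2,         # weakly informative (assumption)
    covariance_prior=np.eye(d),             # prior scale matrix (assumption)
    max_iter=500,
    random_state=0,
)
labels = dpmm.fit(X).predict(X)

# Components that actually receive appreciable posterior weight.
print("occupied components:", int(np.sum(dpmm.weights_ > 1e-2)))
print("cluster sizes:", np.bincount(labels))

Re-running the sketch with different covariance_prior scale matrices is a quick way to probe the kind of prior sensitivity in higher dimensions that the abstract describes.
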
Related Papers
50 items in total (items [41]-[50] shown below)
  • [41] Hybrid Dirichlet mixture models for functional data
    Petrone, Sonia
    Guindani, Michele
    Gelfand, Alan E.
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2009, 71 : 755 - 782
  • [42] Dependent generalized Dirichlet process priors for the analysis of acute lymphoblastic leukemia
    Barcella, William
    De Iorio, Maria
    Favaro, Stefano
    Rosner, Gary L.
    BIOSTATISTICS, 2018, 19 (03) : 342 - 358
  • [43] The Use of Informed Priors in Biclustering of Gene Expression with the Hierarchical Dirichlet Process
    Tercan, Bahar
    Acar, Aybar C.
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2020, 17 (05) : 1810 - 1821
  • [44] A Dirichlet process mixture model for clustering longitudinal gene expression data
    Sun, Jiehuan
    Herazo-Maya, Jose D.
    Kaminski, Naftali
    Zhao, Hongyu
    Warren, Joshua L.
    STATISTICS IN MEDICINE, 2017, 36 (22) : 3495 - 3506
  • [45] Clusterability assessment for Gaussian mixture models
    Nowakowska, Ewa
    Koronacki, Jacek
    Lipovetsky, Stan
    APPLIED MATHEMATICS AND COMPUTATION, 2015, 256 : 591 - 601
  • [46] Mining Numbers in Text Using Suffix Arrays and Clustering Based on Dirichlet Process Mixture Models
    Yoshida, Minoru
    Sato, Issei
    Nakagawa, Hiroshi
    Terada, Akira
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PT II, PROCEEDINGS, 2010, 6119 : 230 - +
  • [47] PReMiuM: An R Package for Profile Regression Mixture Models Using Dirichlet Processes
    Liverani, Silvia
    Hastie, David I.
    Azizi, Lamiae
    Papathomas, Michail
    Richardson, Sylvia
    JOURNAL OF STATISTICAL SOFTWARE, 2015, 64 (07): 1 - 30
  • [48] Hyperspectral Image Segmentation Using The Dirichlet Mixture Models
    Sigirci, Ibrahim Onur
    Bilgin, Gokhan
    2014 22ND SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2014: 983 - 986
  • [49] Clustering with label constrained Dirichlet process mixture model
    Burhanuddin, Nurul Afiqah
    Adam, Mohd Bakri
    Ibrahim, Kamarulzaman
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2022, 107
  • [50] Mean field inference for the Dirichlet process mixture model
    Zobay, O.
    ELECTRONIC JOURNAL OF STATISTICS, 2009, 3 : 507 - 545