Bayesian Estimation of Discrete Multivariate Latent Structure Models With Structural Zeros

Cited by: 33
Authors
Manrique-Vallier, Daniel [1]
Reiter, Jerome P. [2]
Affiliations
[1] Indiana Univ, Dept Stat, Bloomington, IN 47408 USA
[2] Duke Univ, Durham, NC 27708 USA
Funding
National Science Foundation (USA)
Keywords
Dirichlet process; Latent class; Multinomial; Disclosure risk; Confidentiality; Contingency table; LEVEL MIXTURE-MODELS; MULTIPLE IMPUTATION; POPULATION-SIZE; CONTINGENCY-TABLES; CATEGORICAL-DATA; PRIORS; DEFINITION; MULTILEVEL; MICRODATA; RISK;
DOI
10.1080/10618600.2013.844700
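Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract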
In multivariate categorical data, models based on conditional independence assumptions, such as latent class models, offer efficient estimation of complex dependencies. However, Bayesian versions of latent structure models for categorical data typically do not appropriately handle impossible combinations of variables, also known as structural zeros. Allowing nonzero probability for impossible combinations results in inaccurate estimates of joint and conditional probabilities, even for feasible combinations. We present an approach for estimating posterior distributions in Bayesian latent structure models with potentially many structural zeros. The basic idea is to treat the observed data as a truncated sample from an augmented dataset, thereby allowing us to exploit the conditional independence assumptions for computational expediency. As part of the approach, we develop an algorithm for collapsing a large set of structural zero combinations into a much smaller set of disjoint marginal conditions, which speeds up computation. We apply the approach to sample from a semiparametric version of the latent class model with structural zeros in the context of a key issue faced by national statistical agencies seeking to disseminate confidential data to the public: estimating the number of records in a sample that are unique in the population on a set of publicly available categorical variables. The latent class model offers remarkably accurate estimates of population uniqueness, even in the presence of a large number of structural zeros.
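The truncation idea in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' sampler: it enumerates every cell of a small latent class model, expresses the structural zeros as marginal conditions, and renormalizes the feasible cells. All names (`latent_class_joint`, `is_structural_zero`, `truncate`) and all numbers are hypothetical, chosen only for illustration.

```python
from itertools import product


def latent_class_joint(class_weights, cond_probs):
    """Joint pmf over all cells of a latent class model (toy enumeration).

    class_weights[k]   = P(z = k)
    cond_probs[k][j][c] = P(x_j = c | z = k); variables are conditionally
    independent given the latent class z.
    """
    n_levels = [len(p) for p in cond_probs[0]]
    joint = {}
    for cell in product(*(range(n) for n in n_levels)):
        total = 0.0
        for weight, probs in zip(class_weights, cond_probs):
            term = weight
            for j, level in enumerate(cell):
                term *= probs[j][level]
            total += term
        joint[cell] = total
    return joint


def is_structural_zero(cell, zero_conditions):
    """A cell is impossible if it matches every entry of any marginal condition.

    Each condition is a dict {variable index: level}; variables not listed
    are unconstrained, so one condition can cover many cells.
    """
    return any(
        all(cell[j] == level for j, level in cond.items())
        for cond in zero_conditions
    )


def truncate(joint, zero_conditions):
    """Set impossible cells to zero and renormalize the feasible ones."""
    feasible_mass = sum(
        p for cell, p in joint.items()
        if not is_structural_zero(cell, zero_conditions)
    )
    return {
        cell: 0.0 if is_structural_zero(cell, zero_conditions) else p / feasible_mass
        for cell, p in joint.items()
    }


# Hypothetical example: 2 latent classes, 3 binary variables.
class_weights = [0.6, 0.4]
cond_probs = [
    [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]],  # class 0
    [[0.2, 0.8], [0.3, 0.7], [0.5, 0.5]],  # class 1
]
# One marginal condition: x_0 = 1 together with x_1 = 0 is impossible,
# whatever x_2 is. This single condition covers two cells.
zero_conditions = [{0: 1, 1: 0}]

joint = latent_class_joint(class_weights, cond_probs)
truncated = truncate(joint, zero_conditions)
```

Enumeration is only workable for tiny tables. The abstract's approach instead treats the observed data as a truncated sample from an augmented dataset so that the conditional independence structure can still be exploited in computation.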
Pages: 1061-1079
Number of pages: 19
Related papers
50 in total
  • [31] Information-theoretic latent distribution modeling: Distinguishing discrete and continuous latent variable models
    Markon, Kristian E.
    Krueger, Robert F.
    PSYCHOLOGICAL METHODS, 2006, 11 (03) : 228 - 243
  • [32] A semiparametric Bayesian approach for structural equation models
    Song, Xin-Yuan
    Pan, Jun-Hao
    Kwok, Timothy
    Vandenput, Liesbeth
    Ohlsson, Claes
    Leung, Ping-Chung
    BIOMETRICAL JOURNAL, 2010, 52 (03) : 314 - 332
  • [33] Supervised Bayesian latent class models for high-dimensional data
    Desantis, Stacia M.
    Houseman, E. Andres
    Coull, Brent A.
    Nutt, Catherine L.
    Betensky, Rebecca A.
    STATISTICS IN MEDICINE, 2012, 31 (13) : 1342 - 1360
  • [34] Are Bayesian regularization methods a must for multilevel dynamic latent variables models?
    Andriamiarana, Vivato V.
    Kilian, Pascal
    Brandt, Holger
    Kelava, Augustin
    BEHAVIOR RESEARCH METHODS, 2025, 57 (02)
  • [35] Computational Strategies and Estimation Performance With Bayesian Semiparametric Item Response Theory Models
    Paganin, Sally
    Paciorek, Christopher J.
    Wehrhahn, Claudia
    Rodriguez, Abel
    Rabe-Hesketh, Sophia
    de Valpine, Perry
    JOURNAL OF EDUCATIONAL AND BEHAVIORAL STATISTICS, 2023, 48 (02) : 147 - 188
  • [36] Evaluation of Bayesian spatiotemporal latent models in small area health data
    Choi, Jungsoon
    Lawson, Andrew B.
    Cai, Bo
    Hossain, Md. Monir
    ENVIRONMETRICS, 2011, 22 (08) : 1008 - 1022
  • [37] Bayesian Spike Sorting: Parametric and Nonparametric Multivariate Gaussian Mixture Models
    White, Nicole
    van Havre, Zoe
    Rousseau, Judith
    Mengersen, Kerrie L.
    CASE STUDIES IN APPLIED BAYESIAN DATA SCIENCE: CIRM JEAN-MORLET CHAIR, FALL 2018, 2020, 2259 : 215 - 227
  • [38] Intrinsic Bayesian estimation of linear time series models
    Ni, Shawn
    Sun, Dongchu
    STATISTICAL THEORY AND RELATED FIELDS, 2021, 5 (04) : 275 - 287
  • [39] Bayesian Generalized Horseshoe Estimation of Generalized Linear Models
    Schmidt, Daniel F.
    Makalic, Enes
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2019, PT II, 2020, 11907 : 598 - 613
  • [40] On the distinguished role of the multivariate exponential distribution in Bayesian estimation in competing risks problems
    Neath, AA
    Samaniego, FJ
    STATISTICS & PROBABILITY LETTERS, 1996, 31 (01) : 69 - 74