Bayesian Estimation of Discrete Multivariate Latent Structure Models With Structural Zeros

Cited by: 33
Authors
Manrique-Vallier, Daniel [1]
Reiter, Jerome P. [2]
Affiliations
[1] Indiana Univ, Dept Stat, Bloomington, IN 47408 USA
[2] Duke Univ, Durham, NC 27708 USA
Funding
National Science Foundation (USA)
Keywords
Dirichlet process; Latent class; Multinomial; Disclosure risk; Confidentiality; Contingency table; LEVEL MIXTURE-MODELS; MULTIPLE IMPUTATION; POPULATION-SIZE; CONTINGENCY-TABLES; CATEGORICAL-DATA; PRIORS; DEFINITION; MULTILEVEL; MICRODATA; RISK;
DOI
10.1080/10618600.2013.844700
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification codes
020208 ; 070103 ; 0714 ;
Abstract
In multivariate categorical data, models based on conditional independence assumptions, such as latent class models, offer efficient estimation of complex dependencies. However, Bayesian versions of latent structure models for categorical data typically do not appropriately handle impossible combinations of variables, also known as structural zeros. Allowing nonzero probability for impossible combinations results in inaccurate estimates of joint and conditional probabilities, even for feasible combinations. We present an approach for estimating posterior distributions in Bayesian latent structure models with potentially many structural zeros. The basic idea is to treat the observed data as a truncated sample from an augmented dataset, thereby allowing us to exploit the conditional independence assumptions for computational expediency. As part of the approach, we develop an algorithm for collapsing a large set of structural zero combinations into a much smaller set of disjoint marginal conditions, which speeds up computation. We apply the approach to sample from a semiparametric version of the latent class model with structural zeros in the context of a key issue faced by national statistical agencies seeking to disseminate confidential data to the public: estimating the number of records in a sample that are unique in the population on a set of publicly available categorical variables. The latent class model offers remarkably accurate estimates of population uniqueness, even in the presence of a large number of structural zeros.
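The truncation idea in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' sampler: it only shows, for a two-class latent class model on two binary variables, how the mass placed on an impossible cell is removed and the feasible cells are renormalized. All parameter values and the names pi, lam, and structural_zeros are hypothetical, chosen for illustration only.

```python
from itertools import product

# Hypothetical toy parameters (assumptions, not taken from the paper).
pi = [0.6, 0.4]                      # latent class weights
lam = [                              # lam[j][k][c] = P(X_j = c | class k)
    [[0.9, 0.1], [0.2, 0.8]],        # variable 0
    [[0.7, 0.3], [0.1, 0.9]],        # variable 1
]
structural_zeros = {(1, 1)}          # combination assumed to be impossible


def unrestricted_prob(x):
    """Latent class probability of cell x, ignoring structural zeros."""
    total = 0.0
    for k, weight in enumerate(pi):
        term = weight
        for j, c in enumerate(x):
            term *= lam[j][k][c]     # conditional independence given the class
        total += term
    return total


def truncated_probs():
    """Renormalize the unrestricted model over the feasible cells only."""
    cells = list(product(range(2), repeat=2))
    mass_on_zeros = sum(unrestricted_prob(x) for x in structural_zeros)
    return {
        x: 0.0 if x in structural_zeros
        else unrestricted_prob(x) / (1.0 - mass_on_zeros)
        for x in cells
    }


probs = truncated_probs()
for cell, p in sorted(probs.items()):
    print(cell, round(p, 4))
```

In this toy case the impossible cell receives probability zero and the remaining three cells sum to one. The abstract's approach works with such a truncated model indirectly, by treating the observed data as a truncated sample from an augmented dataset, which this sketch does not attempt to reproduce.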
Pages: 1061-1079
Page count: 19