IMPROVING BAYESIAN MIXTURE MODELS FOR MULTIPLE IMPUTATION OF MISSING DATA USING FOCUSED CLUSTERING

Cited by: 0
Authors
Wei, Lan [1]
Reiter, Jerome P. [2]
Affiliations
[1] In4mat Insights, Needham, MA USA
[2] Duke Univ, Dept Stat Sci, Box 90251, Durham, NC 27708 USA
Keywords
incomplete; nonparametric; nonresponse; survey; tensor; Dirichlet process mixtures; categorical data; values; priors
DOI
Not available
Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
We present a joint modeling approach for multiple imputation of missing continuous and categorical variables using Bayesian mixture models. The approach extends the idea of focused clustering, in which one separates variables into two sets before estimating the mixture model. Focus variables include variables with high rates of missingness and possibly other variables that could help improve the quality of the imputations. Non-focus variables include the remainder. In this way, one can use a rich sub-model for the focus set and a simpler model for the non-focus set, thereby concentrating fitting power on the variables with the highest rates of missingness. We present a procedure for specifying which variables with low rates of missingness to include in the focus set. We examine the performance of the imputation procedure using simulation studies based on artificial data and on data from the American Community Survey.
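The split into focus and non-focus variables described in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract does not state how low-missingness variables are chosen for the focus set, so the missingness `threshold`, the `extra_focus` argument, the function names, and the toy records below are all hypothetical and serve only to show the partitioning step that would precede fitting a mixture model.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def missing_rates(rows: Sequence[Dict[str, Optional[object]]]) -> Dict[str, float]:
    """Fraction of records in which each variable is missing (encoded as None)."""
    n = len(rows)
    return {name: sum(r[name] is None for r in rows) / n for name in rows[0]}


def split_focus(
    rates: Dict[str, float],
    threshold: float = 0.2,            # assumed cutoff, not taken from the paper
    extra_focus: Sequence[str] = (),   # low-missingness variables added by hand
) -> Tuple[List[str], List[str]]:
    """Partition variable names into (focus, non_focus) lists."""
    focus = [v for v, r in rates.items() if r >= threshold or v in extra_focus]
    non_focus = [v for v in rates if v not in focus]
    return focus, non_focus


# Hypothetical toy records; None marks a missing value.
rows = [
    {"income": None, "age": 34, "state": "NC", "tenure": "own"},
    {"income": 52000, "age": None, "state": "MA", "tenure": "rent"},
    {"income": None, "age": 51, "state": "NC", "tenure": None},
    {"income": 48000, "age": 29, "state": "MA", "tenure": "rent"},
]

rates = missing_rates(rows)
focus, non_focus = split_focus(rates, threshold=0.5, extra_focus=("age",))
print("focus:", focus)
print("non-focus:", non_focus)
```

Here `income` enters the focus set because of its missingness rate, while `age` is added by hand to stand in for a low-missingness variable judged useful for imputation; the remaining variables would receive the simpler sub-model.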
Pages: 213-230
Number of pages: 18
Related papers
50 in total
  • [1] Bayesian Mixture Models with Focused Clustering for Mixed Ordinal and Nominal Data
    DeYoreo, Maria
    Reiter, Jerome P.
    Hillygus, D. Sunshine
    BAYESIAN ANALYSIS, 2017, 12 (03) : 679 - 703
  • [2] Multiple Imputation of Missing Categorical and Continuous Values via Bayesian Mixture Models With Local Dependence
    Murray, Jared S.
    Reiter, Jerome P.
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2016, 111 (516) : 1466 - 1479
  • [3] BAYESIAN IMPUTATION FOR MISSING DATA
    Nads, Azman A.
    Polestico, Daisy Lou L.
    ADVANCES AND APPLICATIONS IN STATISTICS, 2022, 79 : 83 - 104
  • [4] Multiple imputation of longitudinal categorical data through bayesian mixture latent Markov models
    Vidotto, Davide
    Vermunt, Jeroen K.
    Van Deun, Katrijn
    JOURNAL OF APPLIED STATISTICS, 2020, 47 (10) : 1720 - 1738
  • [5] Partial distance evidential clustering for missing data with multiple imputation
    Tian, Hong-Peng
    Zhang, Zhen
    KNOWLEDGE-BASED SYSTEMS, 2025, 310
  • [6] Semiparametric Fractional Imputation Using Gaussian Mixture Models for Handling Multivariate Missing Data
    Sang, Hejian
    Kim, Jae Kwang
    Lee, Danhyang
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2022, 117 (538) : 654 - 663
  • [7] Cooperative Clustering Missing Data Imputation
    Wan, Daoming
    Razavi-Far, Roozbeh
    Saif, Mehrdad
    2020 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2020, : 1039 - 1045
  • [8] A Bayesian multiple imputation approach to bivariate functional data with missing components
    Jang, Jeong Hoon
    Manatunga, Amita K.
    Chang, Changgee
    Long, Qi
    STATISTICS IN MEDICINE, 2021, 40 (22) : 4772 - 4793
  • [9] A new iterative fuzzy clustering algorithm for multiple imputation of missing data
    Nikfalazar, Sanaz
    Yeh, Chung-Hsing
    Bedingfield, Susan
    Khorshidi, Hadi A.
    2017 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2017,
  • [10] Missing Data and Multiple Imputation
    Cummings, Peter
    JAMA PEDIATRICS, 2013, 167 (07) : 656 - 661