Multiple Imputation of Missing Data in Large Studies With Many Variables: A Fully Conditional Specification Approach Using Partial Least Squares

Cited by: 0
Authors
Grund, Simon [1]
Luedtke, Oliver [2, 3]
Robitzsch, Alexander [2, 3]
Affiliations
[1] Univ Hamburg, Dept Psychol, Von Melle Pk 5, D-20146 Hamburg, Germany
[2] Leibniz Inst Sci & Math Educ, Dept Educ Measurement & Data Sci, Kiel, Germany
[3] Ctr Int Student Assessment, Munich, Germany
Keywords
missing data; multiple imputation; high-dimensional data; composite scores; dimension reduction techniques; CHAINED EQUATIONS; CATEGORICAL-DATA; DATA DESIGNS; MODELS; REGRESSION
DOI
10.1037/met0000694
Chinese Library Classification number
B84 [Psychology]
Discipline classification code
04; 0402
Abstract
Multiple imputation (MI) is one of the most popular methods for handling missing data in psychological research. However, many imputation approaches are poorly equipped to handle a large number of variables, which are a common sight in studies that employ questionnaires to assess psychological constructs. In such a case, conventional imputation approaches often become unstable and require that the imputation model be simplified, for example, by removing variables or combining them into composite scores. In this article, we propose an alternative method that extends the fully conditional specification approach to MI with dimension reduction techniques such as partial least squares. To evaluate this approach, we conducted a series of simulation studies, in which we compared it with other approaches that were based on variable selection, composite scores, or dimension reduction through principal components analysis. Our findings indicate that this novel approach can provide accurate results even in challenging scenarios, where other approaches fail to do so. Finally, we also illustrate the use of this method in real data and discuss the implications of our findings for practice.
Pages: 19