Multiple Imputation of Missing Data in Large Studies With Many Variables: A Fully Conditional Specification Approach Using Partial Least Squares

Times cited: 0
Authors
Grund, Simon [1]
Luedtke, Oliver [2, 3]
Robitzsch, Alexander [2, 3]
Affiliations
[1] Univ Hamburg, Dept Psychol, Von Melle Pk 5, D-20146 Hamburg, Germany
[2] Leibniz Inst Sci & Math Educ, Dept Educ Measurement & Data Sci, Kiel, Germany
[3] Ctr Int Student Assessment, Munich, Germany
Keywords
missing data; multiple imputation; high-dimensional data; composite scores; dimension reduction techniques; CHAINED EQUATIONS; CATEGORICAL-DATA; DATA DESIGNS; MODELS; REGRESSION;
DOI
10.1037/met0000694
Chinese Library Classification
B84 [Psychology];
Subject classification codes
04; 0402;
Abstract
Multiple imputation (MI) is one of the most popular methods for handling missing data in psychological research. However, many imputation approaches are poorly equipped to handle a large number of variables, which are a common sight in studies that employ questionnaires to assess psychological constructs. In such a case, conventional imputation approaches often become unstable and require that the imputation model be simplified, for example, by removing variables or combining them into composite scores. In this article, we propose an alternative method that extends the fully conditional specification approach to MI with dimension reduction techniques such as partial least squares. To evaluate this approach, we conducted a series of simulation studies, in which we compared it with other approaches that were based on variable selection, composite scores, or dimension reduction through principal components analysis. Our findings indicate that this novel approach can provide accurate results even in challenging scenarios, where other approaches fail to do so. Finally, we also illustrate the use of this method in real data and discuss the implications of our findings for practice.
Pages: 19