Tackling Challenges in Data Pooling: Missing Data Handling in Latent Variable Models with Continuous and Categorical Indicators

Cited by: 2
Authors
Chen, Lihan [1, 2]
Miocevic, Milica [1]
Falk, Carl F. [1]
Affiliations
[1] McGill Univ, Montreal, PQ, Canada
[2] McGill Univ, Dept Psychol, Montreal, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Data synthesis; missing data; nonnormal data; ordinal data; pooled data analysis; STRUCTURAL EQUATION MODELS; MAXIMUM-LIKELIHOOD-ESTIMATION; NONNORMAL DATA; R PACKAGE; ROBUST CORRECTIONS; INFORMATION; SEM; MULTIVARIATE; PERFORMANCE;
DOI
10.1080/10705511.2023.2300079
Chinese Library Classification (CLC) number
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
Data pooling is a powerful strategy in empirical research. However, combining multiple datasets often results in a large amount of missing data, as variables that are not present in some datasets effectively contain missing values for all participants in those datasets. Furthermore, data pooling typically leads to a mix of continuous and categorical items with nonnormal multivariate distributions. We investigated two popular approaches to handle missing data in this context: (1) applying direct maximum likelihood by treating data as continuous (con-ML), and (2) applying categorical least squares using a polychoric correlation matrix computed from pairwise deletion (cat-LS). These approaches are available for free and relatively straightforward for empirical researchers to implement. Through simulation studies with confirmatory factor analysis and latent mediation analysis, we found cat-LS to be unsuitable for pooled data analysis, whereas con-ML yielded acceptable performance for the estimation of latent path coefficients barring severe nonnormality.
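The way pooling creates missing data can be illustrated with a minimal sketch. The two toy studies, the variable names (`x1`, `x2`, `x3`), and the helper functions `pool` and `missing_rate` below are hypothetical and are not taken from the article; the sketch only shows that a variable absent from one dataset is missing for every participant of that dataset once the data are stacked.

```python
# Hypothetical toy data: two studies with overlapping but not identical items.
study_a = [
    {"id": "a1", "x1": 3, "x2": 4},
    {"id": "a2", "x1": 2, "x2": 5},
]
study_b = [
    {"id": "b1", "x1": 1, "x3": 0.7},
    {"id": "b2", "x1": 4, "x3": -1.2},
    {"id": "b3", "x1": 5, "x3": 0.1},
]


def pool(*datasets):
    """Stack datasets row-wise; a variable a dataset lacks becomes None."""
    columns = []
    for data in datasets:
        for row in data:
            for key in row:
                if key not in columns:
                    columns.append(key)
    pooled = [
        {col: row.get(col) for col in columns}
        for data in datasets
        for row in data
    ]
    return columns, pooled


def missing_rate(pooled, column):
    """Proportion of pooled rows in which `column` is missing (None)."""
    return sum(row[column] is None for row in pooled) / len(pooled)


columns, pooled = pool(study_a, study_b)
rates = {col: missing_rate(pooled, col) for col in columns}

for col in columns:
    print(f"{col}: {rates[col]:.0%} missing")
```

In this toy case `x2` is missing for all three participants of the second study and `x3` for both participants of the first, so the pooled missingness rate of each variable is determined entirely by which studies measured it.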
Pages: 651-666
Number of pages: 16