Analysis of Interactions and Nonlinear Effects with Missing Data: A Factored Regression Modeling Approach Using Maximum Likelihood Estimation

Cited by: 12
Authors
Luedtke, Oliver [1, 2]
Robitzsch, Alexander [1, 2]
West, Stephen G. [3]
Affiliations
[1] Leibniz Inst Sci & Math Educ, Kiel, Germany
[2] Ctr Int Student Assessment, Munich, Germany
[3] Arizona State Univ, Tempe, AZ 85287 USA
Keywords
Multiple regression; missing data; interaction effects; maximum likelihood estimation; GENERALIZED LINEAR-MODELS; MULTIPLE IMPUTATION; INCOMPLETE DATA; COVARIANCE; STRATEGIES; INFERENCE; RESPONSES; PRODUCTS; VALUES; XS
DOI
10.1080/00273171.2019.1640104
Chinese Library Classification number
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
When estimating multiple regression models with incomplete predictor variables, it is necessary to specify a joint distribution for the predictor variables. A convenient assumption is that this distribution is a multivariate normal distribution, which is also the default in many statistical software packages. This distribution will in general be misspecified if predictors with missing data have nonlinear effects (e.g., x²) or are included in interaction terms (e.g., x · z). In the present article, we introduce a factored regression modeling approach for estimating regression models with missing data that is based on maximum likelihood estimation. In this approach, the model likelihood is factorized into a part that is due to the model of interest and a part that is due to the model for the incomplete predictors. In three simulation studies, we showed that the factored regression modeling approach produced valid estimates of interaction and nonlinear effects in regression models with missing values on categorical or continuous predictor variables under a broad range of conditions. We developed the R package mdmb, which facilitates a user-friendly application of the factored regression modeling approach, and present a real-data example that illustrates the flexibility of the software.
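The factorization described in the abstract can be sketched in equations. The notation is an assumption chosen for illustration and is not taken from the article: y is the outcome, x a predictor with missing values, z a fully observed predictor, β the parameters of the model of interest, and γ the parameters of the predictor model.

```latex
% Illustrative notation (assumed, not from the article):
% y = outcome, x = incomplete predictor, z = complete predictor.
\begin{align}
  % Joint density split into model of interest and predictor model
  f(y, x \mid z; \beta, \gamma)
    &= f(y \mid x, z; \beta)\, f(x \mid z; \gamma) \\
  % Example model of interest with an interaction term
  y &= \beta_0 + \beta_1 x + \beta_2 z + \beta_3\, x z + \varepsilon,
    \qquad \varepsilon \sim N(0, \sigma^2) \\
  % Likelihood contribution of a case i with x observed
  L_i^{\mathrm{obs}}
    &= f(y_i \mid x_i, z_i; \beta)\, f(x_i \mid z_i; \gamma) \\
  % Likelihood contribution of a case i with x missing (continuous x)
  L_i^{\mathrm{mis}}
    &= \int f(y_i \mid x, z_i; \beta)\, f(x \mid z_i; \gamma)\, dx
\end{align}
```

For a categorical x, the integral in the last line becomes a sum over the categories of x. Because the interaction term enters only through the model of interest, no joint multivariate normal distribution for y, x, and x · z has to be assumed.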
Pages: 361-381
Page count: 21