Centering Categorical Predictors in Multilevel Models: Best Practices and Interpretation

Times cited: 118
Authors
Yaremych, Haley E. [1]
Preacher, Kristopher J. [1]
Hedeker, Donald [2]
Affiliations
[1] Vanderbilt Univ, Dept Psychol & Human Dev, PMB 552,230 Appleton Pl, Nashville, TN 37203 USA
[2] Univ Chicago, Dept Publ Hlth Sci, Chicago, IL 60637 USA
Keywords
multilevel modeling; hierarchical linear modeling; centering; categorical predictors; binary predictors; SCHOOL; CLASSROOM; FAMILY; LEVEL; ACHIEVEMENT; VARIABLES; CLIMATE; CLUSTER; MULTICOLLINEARITY; PERFORMANCE
DOI
10.1037/met0000434
Chinese Library Classification (CLC) number
B84 [Psychology]
Discipline classification code
04; 0402
Abstract
The topic of centering in multilevel modeling (MLM) has received substantial attention from methodologists, as different centering choices for lower-level predictors present important ramifications for the estimation and interpretation of model parameters. However, the centering literature has focused almost exclusively on continuous predictors, with little attention paid to whether and how categorical predictors should be centered, despite their ubiquity across applied fields. Alongside this gap in the methodological literature, a review of applied articles showed that researchers center categorical predictors infrequently and inconsistently. Algebraically and statistically, continuous and categorical predictors behave the same, but researchers using them do not, and for many, interpreting the effects of categorical predictors is not intuitive. Thus, the goals of this tutorial article are twofold: to clarify why and how categorical predictors should be centered in MLM, and to explain how multilevel regression coefficients resulting from centered categorical predictors should be interpreted. We first provide algebraic support showing that uncentered coding variables result in a conflated blend of the within- and between-cluster effects of a multicategorical predictor, whereas appropriate centering techniques yield level-specific effects. Next, we provide algebraic derivations to illuminate precisely how the within- and between-cluster effects of a multicategorical predictor should be interpreted under dummy, contrast, and effect coding schemes. Finally, we provide a detailed demonstration of our conclusions with an empirical example. Implications for practice, including relevance of our findings to categorical control variables (i.e., covariates), interaction terms with categorical focal predictors, and multilevel latent variable models, are discussed. 
Translational Abstract
Multilevel modeling (MLM) is frequently used in the social sciences when data are nested or clustered (e.g., students nested within classrooms; clients nested within therapists). Centering is an important topic in MLM because it can be conducted in different ways, each of which yields slightly different parameter estimates that also must be interpreted differently. However, work regarding centering has focused almost exclusively on continuous predictors. Little attention has been paid to categorical predictors, whether and how they should be centered, and how their resulting coefficients should be interpreted. This is problematic, because categorical predictors and covariates are ubiquitous across all fields wherein MLM is used. Thus, the goals of this report are to clarify why and how categorical predictors should be centered in MLM, and to explain how multilevel regression coefficients resulting from centered categorical predictors should be interpreted. We present an overview of popular centering options and provide best-practice recommendations for centering and interpretation of binary and multicategorical predictors. We provide a detailed demonstration of our conclusions with an empirical example from the education literature. In addition, we discuss the practical implications of our work at length; topics include multicategorical covariates, interaction terms with categorical focal predictors, and multilevel latent variable models.
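To make the idea of level-specific effects concrete, the following is a minimal sketch of one common centering technique, cluster-mean centering, applied to dummy codes for a three-category predictor. The abstract does not say which technique the authors recommend, so the choice of cluster-mean centering is an assumption here, and the function names and toy data are hypothetical rather than taken from the article. Each dummy code is split into a between-cluster part (the cluster's proportion in that category) and a within-cluster part (the code minus that proportion).

```python
from collections import defaultdict


def dummy_code(category, levels, reference):
    """Return 0/1 dummy codes for every non-reference level."""
    return {lvl: 1.0 if category == lvl else 0.0
            for lvl in levels if lvl != reference}


def cluster_mean_center(rows, levels, reference):
    """Split each dummy code into within- and between-cluster parts.

    rows: list of (cluster_id, category) pairs.
    The between part is the cluster mean of the dummy code, i.e. the
    proportion of the cluster in that category; the within part is the
    raw code minus that cluster mean.
    """
    coded = [(cluster, dummy_code(cat, levels, reference))
             for cluster, cat in rows]

    # Accumulate per-cluster sums and sizes to compute cluster means.
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for cluster, codes in coded:
        counts[cluster] += 1
        for lvl, val in codes.items():
            sums[cluster][lvl] += val

    out = []
    for cluster, codes in coded:
        means = {lvl: sums[cluster][lvl] / counts[cluster] for lvl in codes}
        within = {lvl: codes[lvl] - means[lvl] for lvl in codes}
        out.append({"cluster": cluster, "within": within, "between": means})
    return out


# Hypothetical toy data: (classroom id, student category).
rows = [("c1", "A"), ("c1", "B"), ("c1", "B"), ("c1", "C"),
        ("c2", "A"), ("c2", "A"), ("c2", "C"), ("c2", "C")]
centered = cluster_mean_center(rows, levels=["A", "B", "C"], reference="A")
```

In a multilevel regression, the within parts would enter as level-1 predictors and the between parts as level-2 predictors, so that each coefficient refers to a single level instead of a blend of the two. Entering the raw 0/1 codes alone corresponds to the uncentered case the abstract describes as conflated.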
Pages: 613-630
Number of pages: 18