Modeling Item-Level Heterogeneous Treatment Effects With the Explanatory Item Response Model: Leveraging Large-Scale Online Assessments to Pinpoint the Impact of Educational Interventions

Cited by: 14
Authors
Gilbert, Joshua B. [1]
Kim, James S. [1]
Miratrix, Luke W. [1]
Affiliations
[1] Harvard Univ, Grad Sch Educ, Cambridge, MA 02138 USA
Keywords
heterogeneous treatment effects; explanatory item response model; causal inference; simulation; psychometrics; LOGISTIC-REGRESSION; RASCH MODEL; PACKAGE
DOI
10.3102/10769986231171710
Chinese Library Classification (CLC) number
G40 [Education]
Subject classification code
040101; 120403
Abstract
Analyses that reveal how treatment effects vary allow researchers, practitioners, and policymakers to better understand the efficacy of educational interventions. In practice, however, standard statistical methods for addressing heterogeneous treatment effects (HTE) fail to address the HTE that may exist within outcome measures. In this study, we present a novel application of the explanatory item response model (EIRM) for assessing what we term "item-level" HTE (IL-HTE), in which a unique treatment effect is estimated for each item in an assessment. Results from data simulation reveal that when IL-HTE is present but ignored in the model, standard errors can be underestimated and false positive rates can increase. We then apply the EIRM to assess the impact of a literacy intervention, focused on promoting transfer in reading comprehension, on a digital assessment delivered online to approximately 8,000 third-grade students. We demonstrate that allowing for IL-HTE can reveal treatment effects at the item level that are masked by a null average treatment effect, and the EIRM can thus provide fine-grained information for researchers and policymakers on the potentially heterogeneous causal effects of educational interventions.
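To make the idea of item-level HTE concrete, the following is a minimal illustrative sketch, not the authors' code or their simulation design. It assumes one common way to write such a model: a Rasch-type logit equal to student ability plus item easiness plus an item-specific treatment effect, where each item's effect is the average effect plus a normally distributed deviation. The function names (`simulate_il_hte`, `per_item_gaps`) and all parameter values are hypothetical. The sketch only generates data and reports raw per-item differences in proportion correct; it does not fit an EIRM, which would typically require a mixed-effects logistic regression.

```python
import math
import random


def simulate_il_hte(n_students=2000, n_items=10, ate=0.0,
                    sd_item_effect=0.5, seed=0):
    """Simulate binary item responses with item-specific treatment effects.

    Assumed data-generating model (an illustration, not taken from the paper):
        logit P(y = 1) = ability + easiness[i] + treated * item_effect[i]
    """
    rng = random.Random(seed)
    # Item easiness, and each item's own treatment effect around the average.
    easiness = [rng.gauss(0.0, 1.0) for _ in range(n_items)]
    item_effect = [ate + rng.gauss(0.0, sd_item_effect) for _ in range(n_items)]
    rows = []
    for s in range(n_students):
        treated = s % 2  # alternating assignment as a stand-in for randomization
        ability = rng.gauss(0.0, 1.0)
        for i in range(n_items):
            logit = ability + easiness[i] + treated * item_effect[i]
            p = 1.0 / (1.0 + math.exp(-logit))
            y = 1 if rng.random() < p else 0
            rows.append((s, i, treated, y))
    return rows, item_effect


def per_item_gaps(rows, n_items):
    """Treatment-minus-control difference in proportion correct, per item."""
    correct = [[0, 0] for _ in range(n_items)]
    count = [[0, 0] for _ in range(n_items)]
    for _, i, t, y in rows:
        correct[i][t] += y
        count[i][t] += 1
    return [correct[i][1] / count[i][1] - correct[i][0] / count[i][0]
            for i in range(n_items)]


rows, item_effect = simulate_il_hte()
gaps = per_item_gaps(rows, 10)
for i, (true_effect, gap) in enumerate(zip(item_effect, gaps)):
    print(f"item {i}: true logit effect {true_effect:+.2f}, raw gap {gap:+.3f}")
```

With the average effect set to zero, individual items can still show positive or negative gaps, which is the situation the abstract describes as item-level effects masked by a null average treatment effect.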
Pages: 889-913
Page count: 25