A Propensity Score Method for Investigating Differential Item Functioning in Performance Assessment

Cited by: 12
Authors
Chen, Michelle Y. [1 ]
Liu, Yan [2 ]
Zumbo, Bruno D. [2 ]
Affiliations
[1] Paragon Testing Enterprises, 110-2925 Virtual Way, Vancouver, BC V5M 4X5, Canada
[2] Univ British Columbia, Vancouver, BC, Canada
Keywords
differential item functioning (DIF); performance assessment; propensity score matching; mixed effects model; validation; writing assessment; DESIGN SENSITIVITY; TESTS; DIF;
DOI
10.1177/0013164419878861
Chinese Library Classification number
G44 [Educational Psychology];
Discipline classification code
0402 ; 040202 ;
Abstract
This study introduces a novel differential item functioning (DIF) method based on propensity score matching that tackles two challenges in analyzing performance assessment data, that is, continuous task scores and lack of a reliable internal variable as a proxy for ability or aptitude. The proposed DIF method consists of two main stages. First, propensity score matching is used to eliminate preexisting group differences before the test, ideally creating equivalent groups as in a randomized experimental study. Then, linear mixed effects models are adopted to perform DIF analysis based on the matched data set. We demonstrate this propensity DIF method using a high-stakes functional English language proficiency test. DIF due to education was investigated in the writing component, which consists of two continuously scored performance-based tasks. Although the proposed method is demonstrated in the context of language testing, it can be applied to other types of performance assessments.
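The two stages described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the data are simulated, a single hypothetical pre-test covariate `x` stands in for the background variables, matching is greedy 1:1 nearest-neighbor within a caliper, and the second stage replaces the linear mixed effects model with a simpler within-pair contrast between the two tasks.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mean(values):
    return sum(values) / len(values)


def fit_logistic(xs, g, lr=0.5, epochs=500):
    """Single-covariate logistic regression fitted by batch gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad0 = grad1 = 0.0
        for x_i, g_i in zip(xs, g):
            err = sigmoid(b0 + b1 * x_i) - g_i
            grad0 += err
            grad1 += err * x_i
        b0 -= lr * grad0 / n
        b1 -= lr * grad1 / n
    return b0, b1


def match_pairs(ps, group, caliper):
    """Greedy 1:1 nearest-neighbor matching without replacement, within a caliper."""
    pool = [i for i, g_i in enumerate(group) if g_i == 0]  # unmatched reference cases
    pairs = []
    for i, g_i in enumerate(group):
        if g_i != 1 or not pool:
            continue
        j = min(pool, key=lambda k: abs(ps[k] - ps[i]))
        if abs(ps[j] - ps[i]) <= caliper:
            pairs.append((i, j))
            pool.remove(j)
    return pairs


# Simulated data (assumption): group membership depends on the covariate x,
# so the unmatched groups differ before the test.
random.seed(0)
n = 400
x = [random.gauss(0, 1) for _ in range(n)]
group = [1 if random.random() < sigmoid(x_i) else 0 for x_i in x]

# Two continuously scored tasks; task 2 carries a built-in group effect (DIF).
TRUE_DIF = 1.0
task1 = [5 + x_i + random.gauss(0, 0.3) for x_i in x]
task2 = [5 + x_i + TRUE_DIF * g_i + random.gauss(0, 0.3) for x_i, g_i in zip(x, group)]

# Stage 1: estimate propensity scores and build the matched data set.
b0, b1 = fit_logistic(x, group)
ps = [sigmoid(b0 + b1 * x_i) for x_i in x]
pairs = match_pairs(ps, group, caliper=0.05)

# Covariate balance: absolute mean difference before and after matching.
before = abs(mean([x[i] for i in range(n) if group[i] == 1])
             - mean([x[i] for i in range(n) if group[i] == 0]))
after = abs(mean([x[i] - x[j] for i, j in pairs]))

# Stage 2 (simplified stand-in for the mixed effects model): the group gap on
# each task within matched pairs, and the task-by-group contrast between them.
gap_task1 = mean([task1[i] - task1[j] for i, j in pairs])
gap_task2 = mean([task2[i] - task2[j] for i, j in pairs])
dif_contrast = gap_task2 - gap_task1

print(f"matched pairs: {len(pairs)}")
print(f"covariate gap before: {before:.3f}, after: {after:.3f}")
print(f"task-by-group contrast: {dif_contrast:.3f}")
```

In this sketch a contrast near zero would suggest the tasks behave alike for matched test takers, while a contrast near the simulated `TRUE_DIF` flags task 2. A fuller analysis along the lines of the abstract would fit a mixed effects model with a person-level random effect to the matched data instead of this paired contrast.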
Pages: 476-498
Page count: 23