Dimensionality and generalizability of domain-independent performance assessments

Cited by: 15
Authors
Baker, EL
Abedi, J
Linn, RL
Niemi, D
Affiliations
[1] UNIV COLORADO, NATL CTR RES EVALUAT STAND & STUDENT TESTING, BOULDER, CO 80309 USA
[2] UNIV MISSOURI, COLUMBIA, MO USA
DOI
10.1080/00220671.1996.9941205
Chinese Library Classification (CLC) number
G40 [Education]
Discipline classification codes
040101; 120403
Abstract
Empirical guidance for the design of comparable performance assessments is sorely lacking. A study was conducted to assess the degree to which domain specifications control topic and rater variability, focusing on task generalizability, rater reliability, and scoring rubric dimensionality. Two classes of history students were administered three on-demand, multistep performance tasks a week apart. For each topic, all students completed a Prior Knowledge Test, read primary source materials, and wrote an essay of explanation. Using a theory-based scoring rubric, four trained raters scored all essays. Inter- and intrarater reliabilities and g-study results are reported. Results show relative efficiency for the assessment approach. The dimensionality analysis supported two factors, Deep Understanding and Surface Understanding, across the three topics. Prior Knowledge scores and GPA in history courses correlated with the Deep Understanding elements of the scoring rubric. Implications for design and testing purposes are discussed.
Pages: 197-205
Page count: 9