Scoring rubric reliability and internal validity in rater-mediated EFL writing assessment: Insights from many-facet Rasch measurement

Cited by: 10
Authors
Li, Wentao [1 ]
Affiliations
[1] Univ British Columbia, Vancouver, BC, Canada
Keywords
Scoring rubric; Rater-mediated assessment; EFL writing; Validation; Many-facet Rasch measurement; PEER ASSESSMENT; STUDENTS; FEEDBACK; QUALITY;
DOI
10.1007/s11145-022-10279-1
CLC number
G40 [Education];
Subject classification code
040101; 120403;
Abstract
Scoring rubrics are known to be effective for assessing writing for both testing and classroom teaching purposes. How raters interpret the descriptors in a rubric can significantly affect the final score, and the descriptors may also color a rater's judgment of a student's writing quality. Little is known, however, about how peer raters use a teacher-developed scoring rubric in English as a Foreign Language (EFL) writing contexts. In the current study, Many-Facet Rasch Measurement (MFRM) was applied to examine a scoring rubric for EFL writing and to analyze the severity and consistency of rating behaviors among teachers and peer raters. The findings revealed four key points: (1) the scoring rubric differentiated students' writing skills and measured the construct as expected; (2) the four rating criteria were reasonably designed, but the scoring bands were not wide enough to separate levels of students' writing ability; (3) teachers were stricter raters than student peers, but both showed a central-tendency effect; and (4) teachers had outstanding intra-rater reliability, while peers showed higher inter-rater reliability. The findings provide some evidence for the reliability and internal validity of the scoring rubric and indicate that teachers could use it for assessing EFL writing; however, the scoring bands would need to be broadened to cover a wider range of students' writing levels. Implications for introducing scoring rubrics into peer-mediated assessment for teaching writing, developing scoring rubrics for EFL writing assessment, and using MFRM to evaluate scoring rubrics are discussed.
Pages: 2409-2431 (23 pages)