Content-based methods in peer assessment of open-response questions to grade students as authors and as graders

Cited by: 16

Authors:

Luaces, Oscar ^{[1]}

Diez, Jorge ^{[1]}

Alonso-Betanzos, Amparo

Troncoso, Alicia ^{[2,3]}

Bahamonde, Antonio ^{[1]}

Affiliations:

[1] Univ Oviedo, Ctr Artificial Intelligence, Gijon 33204, Spain

[2] Univ A Coruna, Fac Informat, Dept Comp Sci, La Coruna 15071, Spain

[3] Pablo Olavide Univ, Dept Comp Sci, Seville 41013, Spain

Source:

KNOWLEDGE-BASED SYSTEMS | 2017 / Vol. 117

Keywords:

Peer assessment; Factorization; Preference learning; Grading graders; MOOCs

DOI:

10.1016/j.knosys.2016.06.024

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Massive Open Online Courses (MOOCs) use different types of assignments in order to evaluate student knowledge. Multiple-choice tests are particularly apt given the possibility for automatic assessment of large numbers of assignments. However, certain skills require open responses that cannot be assessed automatically, yet their evaluation by instructors or teaching assistants is unfeasible given the large number of students. A potentially effective solution is peer assessment, whereby students grade the answers of other students. However, to avoid bias due to inexperience, such grades must be filtered. We describe a factorization approach to grading, as a scalable method capable of dealing with very high volumes of data. Our method is also capable of representing open-response content using a vector space model of the answers. Since reliable peer assessment requires students to make coherent assessments, students can be motivated by their assessments reflecting not only their own answers but also their efforts as graders. The method described is able to tackle both these aspects simultaneously. Finally, for a real-world university setting in Spain, we compared grades obtained by our method and grades awarded by university instructors, with results indicating a notable improvement from using a content-based approach. There was no evidence that instructor grading would have led to more accurate grading outcomes than the assessment produced by our models. © 2016 Elsevier B.V. All rights reserved.


Pages: 79-87

Number of pages: 9
