Automated estimation of item difficulty for multiple-choice tests: An application of word embedding techniques

Cited by: 24

Authors:

Fu-Yuan, Hsu ^{[1,4]}

Hahn-Ming, Lee ^{[1,5]}

Tao-Hsing, Chang ^{[2]}

Yao-Ting, Sung ^{[3]}

Affiliations:

[1] Natl Taiwan Univ Sci & Technol, Dept Comp Sci & Informat Engn, 43,Sec 4,Keelung Rd, Taipei 106, Taiwan

[2] Natl Kaohsiung Univ Appl Sci, Dept Comp Sci & Informat Engn, 415 Jiangong Rd, Kaohsiung 807, Taiwan

[3] Natl Taiwan Normal Univ, Dept Educ Psychol & Counseling, 162,Sec 1,Heping E Rd, Taipei 106, Taiwan

[4] Natl Taiwan Normal Univ, Res Ctr Psychol & Educ Testing, 162,Sec 1,Heping E Rd, Taipei 106, Taiwan

[5] Acad Sinica, Inst Informat Sci, 128,Sec 2,Acad Rd, Taipei 115, Taiwan

Source:

INFORMATION PROCESSING & MANAGEMENT | 2018 / Vol. 54 / Issue 06

Keywords:

Multiple-choice item; Item difficulty estimation; Cognitive processing model; Semantic similarity; Word embedding; Machine learning; SPACE MODELS; CHINESE;

DOI:

10.1016/j.ipm.2018.06.007

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Subject classification code:

0812 ;

Abstract:

Pretesting is the most commonly used method for estimating test item difficulty because it provides highly accurate results that can be applied to assessment development activities. However, pretesting is inefficient, and it can lead to item exposure. Hence, an increasing number of studies have invested considerable effort in researching the automated estimation of item difficulty. Language proficiency tests constitute the majority of researched test topics, while comparatively less research has focused on content subjects. This paper introduces a novel method for the automated estimation of item difficulty for social studies tests. In this study, we explore the difficulty of multiple-choice items, which consist of the following item elements: a question and alternative options. We use learning materials to construct a semantic space using word embedding techniques and project an item's texts into the semantic space to obtain corresponding vectors. Semantic features are obtained by calculating the cosine similarity between the vectors of item elements. Subsequently, these semantic features are sent to a classifier for training and testing. Based on the output of the classifier, an estimation model is created and item difficulty is estimated. Our findings suggest that the semantic similarity between a stem and the options has the strongest impact on item difficulty. Furthermore, the results indicate that the proposed estimation method outperforms pretesting, and therefore, we expect that the proposed approach will complement and partially replace pretesting in future.
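The semantic-feature step described in the abstract (project the stem and each option into a semantic space, then take cosine similarities) might be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy embedding table, the word-averaging projection, and the function names `text_vector`, `cosine_similarity`, and `stem_option_similarities` are all assumptions made for the example. The classifier stage is omitted.

```python
import math

# Hypothetical toy embeddings. In the paper, the semantic space is built
# from learning materials with word embedding techniques; these values
# are invented purely for illustration.
TOY_EMBEDDINGS = {
    "river": [0.9, 0.1, 0.0],
    "flood": [0.8, 0.2, 0.1],
    "dam": [0.7, 0.3, 0.0],
    "tax": [0.0, 0.9, 0.2],
    "election": [0.1, 0.2, 0.9],
}


def text_vector(text, embeddings):
    """Project a text into the space by averaging its known word vectors.

    Averaging is an assumption of this sketch; the abstract does not say
    how item texts are projected. Returns None if no word is known.
    """
    vectors = [embeddings[w] for w in text.lower().split() if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def stem_option_similarities(stem, options, embeddings):
    """Return one stem-to-option cosine similarity per option."""
    stem_vec = text_vector(stem, embeddings)
    similarities = []
    for option in options:
        option_vec = text_vector(option, embeddings)
        if stem_vec is None or option_vec is None:
            similarities.append(0.0)  # no known words: treat as unrelated
        else:
            similarities.append(cosine_similarity(stem_vec, option_vec))
    return similarities


# Semantic features for one made-up item: a stem and three options.
features = stem_option_similarities(
    "river flood", ["dam", "tax", "election"], TOY_EMBEDDINGS
)
```

In a fuller pipeline, feature lists like `features` would be passed, together with known difficulty labels, to a classifier for training and testing, as the abstract outlines.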

Pages: 969-984

Page count: 16

