Utility of a multimodal computer-based assessment format for assessment with a higher degree of reliability and validity

Cited by: 0
Authors
Renes, Johan [1 ]
van der Vleuten, Cees P. M. [2 ]
Collares, Carlos F. [2 ,3 ,4 ]
Affiliations
[1] Maastricht Univ, Dept Human Biol, Maastricht, Netherlands
[2] Maastricht Univ, Fac Hlth Med & Life Sci, Dept Educ Res & Dev, Maastricht, Netherlands
[3] European Board Med Assessors, Edinburgh, Midlothian, Scotland
[4] Stichting Aphasiahelp, Maastricht, Netherlands
Keywords
Computer-based assessment; design-based experiment; crossover design; psychometrics; Rasch model; multiple-choice questions; clinical decision-making; item response theory; professional competence; medical education; strengths; students; impact; tests; guide
DOI
10.1080/0142159X.2022.2137011
Chinese Library Classification
G40 [Education]
Subject Classification Codes
040101; 120403
Abstract
Multiple-choice questions (MCQs) suffer from cueing effects, variable item quality, and an emphasis on testing factual knowledge. This study presents a novel multimodal test containing alternative item types in a computer-based assessment (CBA) format, designated Proxy-CBA. The Proxy-CBA was compared to a standard MCQ-CBA with respect to validity, reliability, standard error of measurement (SEM), and cognitive load, using a quasi-experimental crossover design. Biomedical students were randomized into two groups to sit a 65-item formative exam, starting with the MCQ-CBA followed by the Proxy-CBA (group 1, n = 38) or in the reverse order (group 2, n = 35). Subsequently, a questionnaire on perceived cognitive load was administered and answered by 71 participants. Both CBA formats were analyzed using classical test theory and the Rasch model. Compared to the MCQ-CBA, the Proxy-CBA yielded lower raw scores (p < 0.001, η² = 0.276), higher reliability estimates (p < 0.001, η² = 0.498), lower SEM estimates (p < 0.001, η² = 0.807), and lower theta ability scores (p < 0.001, η² = 0.288). The questionnaire revealed no significant differences between the two CBA formats in perceived cognitive load. Compared to the MCQ-CBA, the Proxy-CBA showed increased reliability and a higher degree of validity with similar cognitive load, suggesting its utility as an alternative assessment format.
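For context, the reliability and SEM figures above follow standard psychometric definitions; the record does not spell out the formulas, so the following is a minimal sketch using the conventional dichotomous Rasch model and the classical-test-theory standard error of measurement (the paper's exact estimation settings are not given here and are an assumption).

P(X_{ni} = 1 \mid \theta_n, b_i) = \frac{\exp(\theta_n - b_i)}{1 + \exp(\theta_n - b_i)}

\mathrm{SEM} = SD_X \sqrt{1 - r_{XX'}}

Here \theta_n is the ability of person n, b_i the difficulty of item i, SD_X the observed-score standard deviation, and r_{XX'} the reliability coefficient. As an illustration with made-up numbers: for SD_X = 10, a reliability of 0.75 gives SEM = 10·√0.25 = 5 score points, while a reliability of 0.90 gives about 3.2, consistent with the reported pattern of higher reliability accompanying lower SEM.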
Pages: 433-441
Number of pages: 9