Rarely selected distractors in high stakes medical multiple-choice examinations and their recognition by item authors: a simulation and survey

Cited by: 21
Authors
Rogausch, Anja [1]
Hofer, Rainer [1]
Krebs, Rene [1]
Affiliations
[1] Univ Bern, Fac Med, Inst Med Educ, Assessment & Evaluat Unit, CH-3010 Bern, Switzerland
Keywords
NONFUNCTIONING OPTIONS; BEST-ANSWER; QUESTIONS; NUMBER; TESTS; TIMES;
DOI
10.1186/1472-6920-10-85
Chinese Library Classification number
G40 [Education];
Discipline classification codes
040101 ; 120403 ;
Abstract
Background: Many medical exams use 5 options for multiple-choice questions (MCQs), although the literature suggests that 3 options are optimal. Previous studies on this topic have often been based on non-medical examinations, so we sought to analyse rarely selected, 'non-functional' distractors (NF-D) in high stakes medical examinations, their detection by item authors, and the psychometric changes resulting from a reduction in the number of options.

Methods: Based on Swiss Federal MCQ examinations from 2005-2007, the frequency of NF-D (selected by <1% or <5% of the candidates) was calculated. The distractors chosen least or second least often were identified, and the candidates who chose them were allocated to the remaining options using two extreme assumptions about their hypothetical behaviour: if rarely selected distractors were eliminated, candidates would either randomly choose another option or purposively choose the correct answer, from which they had originally been distracted. In a second step, 37 experts were asked to mark the least plausible options. The consequences of a reduction from 4 to 3 or 2 distractors, based either on item statistics or on the experts' ratings, were modelled with respect to difficulty, discrimination and reliability.

Results: About 70% of the 5-option items had at least 1 NF-D selected by <1% of the candidates (97% for NF-Ds selected by <5%). Relevant differences in reliability and difficulty (and, to a lesser degree, discrimination) arose only with a reduction to 2 distractors combined with the assumption that candidates would switch to the correct answer in the absence of a 'non-functional' distractor. The experts' ratings resulted in slightly greater changes than the statistical approach.

Conclusions: Based on item statistics and/or an expert panel's recommendation, a varying number of 3-4 (or partly 2) plausible distractors could be chosen without marked deterioration in psychometric characteristics.
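The reallocation step described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the example response counts, and the reading of "randomly choose another option" as a uniform draw over all remaining options (including the correct answer) are assumptions made for illustration only.

```python
import random

def nonfunctional_distractors(counts, key, threshold=0.05):
    """Return distractors chosen by fewer than `threshold` of all candidates."""
    total = sum(counts.values())
    return [opt for opt, n in counts.items()
            if opt != key and n / total < threshold]

def reallocate(counts, key, removed, assumption, rng=None):
    """Remove the options in `removed` and reassign their candidates.

    assumption="correct": every displaced candidate switches to the key.
    assumption="random":  every displaced candidate picks uniformly among
                          the remaining options (assumed to include the key).
    """
    remaining = {opt: n for opt, n in counts.items() if opt not in removed}
    displaced = sum(counts[opt] for opt in removed)
    if assumption == "correct":
        remaining[key] += displaced
    elif assumption == "random":
        rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
        options = sorted(remaining)
        for _ in range(displaced):
            remaining[rng.choice(options)] += 1
    else:
        raise ValueError("assumption must be 'correct' or 'random'")
    return remaining

def difficulty(counts, key):
    """Item difficulty as the proportion of candidates answering correctly."""
    return counts[key] / sum(counts.values())

# Hypothetical response counts for one 5-option item; "A" is the correct answer.
counts = {"A": 600, "B": 250, "C": 120, "D": 25, "E": 5}
nfd = nonfunctional_distractors(counts, key="A", threshold=0.05)
upper = reallocate(counts, "A", nfd, "correct")
lower = reallocate(counts, "A", nfd, "random")
print(nfd, difficulty(upper, "A"), difficulty(lower, "A"))
```

The two assumptions bracket the plausible effect on difficulty: switching every displaced candidate to the correct answer gives the largest possible increase, while the random draw gives a smaller one. Discrimination and reliability would require candidate-level scores and are not covered by this sketch.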
Pages: 9