The effects of violating standard item writing principles on tests and students: The consequences of using flawed test items on achievement examinations in medical education

Cited by: 128
Authors
Downing, SM [1]
Affiliations
[1] Univ Illinois, Coll Med, Dept Med Educ, Chicago, IL 60612 USA
Keywords
achievement testing in medical education; construct-irrelevant variance (CIV); flawed test items; item difficulty effects from flawed items; item writing principles; multiple-choice questions (MCQs); pass-fail effects from flawed items; standard test items; written tests;
DOI
10.1007/s10459-004-4019-5
Chinese Library Classification number
G40 [Education]
Discipline classification codes
040101; 120403
Abstract
The purpose of this research was to study the effects of violations of standard multiple-choice item writing principles on test characteristics, student scores, and pass-fail outcomes. Four basic science examinations, administered to year-one and year-two medical students, were randomly selected for study. Test items were classified as either standard or flawed by three independent raters, blinded to all item performance data. Flawed test questions violated one or more standard principles of effective item writing. Thirty-six to sixty-five percent of the items on the four tests were flawed. Flawed items were 0-15 percentage points more difficult than standard items measuring the same construct. Over all four examinations, 646 (53%) students passed the standard items while 575 (47%) passed the flawed items. The median passing rate difference between flawed and standard items was 3.5 percentage points, but ranged from -1 to 35 percentage points. Item flaws had little effect on test score reliability or other psychometric quality indices. Results showed that flawed multiple-choice test items, which violate well-established and evidence-based principles of effective item writing, disadvantage some medical students. Item flaws introduce the systematic error of construct-irrelevant variance to assessments, thereby reducing the validity evidence for examinations and penalizing some examinees.
Pages: 133-143
Page count: 11
Related papers
19 in total
[1]  
Albanese M. A., 1993, ED MEASUREMENT ISSUE, V12, P28, DOI 10.1111/J.1745-3992.1993.TB00521.X
[2]  
CASE SM, 1989, P 28 ANN C RES MED E, P167
[3]  
Case SM, 1998, CONSTRUCTING WRITTEN
[4]   THE VALIDITY OF 2 ITEM-WRITING RULES [J].
CREHAN, K ;
HALADYNA, TM .
JOURNAL OF EXPERIMENTAL EDUCATION, 1991, 59 (02) :183-192
[5]  
Dawson-Saunders B., 1989, P 28 ANN C RES MED E, P161
[6]  
Downing S.M., 1991, ANN M NAT COUNC MEAS
[7]   Construct-irrelevant variance and flawed test questions: Do multiple-choice item-writing principles make any difference? [J].
Downing, SM .
ACADEMIC MEDICINE, 2002, 77 (10) :S103-S104
[8]  
DOWNING SM, 1995, APPL MEAN EDUC, V8, P89
[9]  
Frary Robert B., 1991, APPL MEAS EDUC, V4, P115, DOI 10.1207/s15324818ame0402_2
[10]   A review of multiple-choice item-writing guidelines for classroom assessment [J].
Haladyna, TM ;
Downing, SM ;
Rodriguez, MC .
APPLIED MEASUREMENT IN EDUCATION, 2002, 15 (03) :309-334