A suggestive approach for assessing item quality, usability and validity of Automatic Item Generation

Cited by: 5
Authors
Falcao, Filipe [1 ,2 ,3 ]
Pereira, Daniela Marques [1 ,2 ,3 ]
Goncalves, Nuno [2 ,3 ]
De Champlain, Andre [4 ]
Costa, Patricio [1 ,2 ]
Pego, Jose Miguel [1 ,2 ,3 ]
Affiliations
[1] Univ Minho, Life & Hlth Sci Res Inst ICVS, Sch Med, P-4710057 Braga, Portugal
[2] PT Govt Associate Lab, ICVS 3Bs, Guimaraes, Portugal
[3] ICognitus4All IT Solut, Braga, Portugal
[4] AstraZeneca, Gaithersburg, MD USA
Keywords
Automatic Item Generation; Item quality; Item writing; Usability; Validity; RESPONSE THEORY; MULTIPLE-CHOICE; EFFICIENCY; IMPROVE; MODELS; DETECT
DOI
10.1007/s10459-023-10225-y
Chinese Library Classification (CLC)
G40 [Education]
Subject classification codes
040101; 120403
Abstract
Automatic Item Generation (AIG) is the process of using cognitive models to generate test items with computer modules. It is a new but rapidly evolving research area in which cognitive and psychometric theory are combined into a digital framework. However, how AIG compares with traditional item development methods in terms of item quality, usability and validity remains unclear. This paper takes a top-down, strong-theory approach to evaluating AIG in medical education. Two studies were conducted. In Study I, participants with different levels of clinical knowledge and item-writing experience developed medical test items both manually and through AIG, and the two item types were compared in terms of quality and usability (efficiency and learnability). In Study II, automatically generated items were included in a summative exam in the content area of surgery, and a psychometric analysis based on Item Response Theory examined the validity and quality of the AIG items. Items generated by AIG were of high quality, showed evidence of validity and were adequate for testing students' knowledge. The time spent developing the content for item generation (cognitive models) and the number of items generated did not vary with the participants' item-writing experience or clinical knowledge. AIG produces numerous high-quality items in a fast, economical and easy-to-learn process, even for item writers who are inexperienced or lack clinical training. Medical schools may therefore benefit from a substantial improvement in the cost-efficiency of test-item development by using AIG. Item-writing flaws can be significantly reduced through the application of AIG's cognitive models, yielding test items capable of accurately gauging students' knowledge.
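
The record contains no code, but the two techniques named in the abstract (template-based item generation from a cognitive model, and an Item Response Theory check of the resulting items) can be illustrated with a minimal, self-contained Python sketch. Everything below is hypothetical: the stem, the slot values and the 2PL parameters are invented for illustration and are not taken from the study.

import itertools
import math

# Hypothetical cognitive-model template for a clinical MCQ stem (illustrative
# only, not an item model from the study). Braced slots are variable elements.
STEM = ("A {age}-year-old patient presents with {finding} after {context}. "
        "What is the most appropriate next step?")

# Allowed values for each slot; a real cognitive model would also encode
# constraints between slots and the matching key/distractor logic.
SLOTS = {
    "age": ["25", "60", "78"],
    "finding": ["acute abdominal pain", "rebound tenderness"],
    "context": ["blunt abdominal trauma", "recent laparotomy"],
}

def generate_items(stem, slots):
    """Enumerate every combination of slot values and fill the stem template."""
    keys = list(slots)
    for values in itertools.product(*(slots[k] for k in keys)):
        yield stem.format(**dict(zip(keys, values)))

def irt_2pl(theta, a, b):
    """Two-parameter logistic IRT model: probability of a correct response
    given ability theta, discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

if __name__ == "__main__":
    items = list(generate_items(STEM, SLOTS))
    print(f"{len(items)} item variants generated from one item model")
    print(items[0])
    # Toy psychometric check for one hypothetical item (a=1.2, b=0.5):
    # probability of a correct response at three ability levels.
    for theta in (-2, 0, 2):
        print(theta, round(irt_2pl(theta, a=1.2, b=0.5), 3))

In practice, the enumeration step is what makes AIG economical: one validated item model yields many isomorphic items, and the IRT analysis then checks whether those generated items behave as intended psychometrically.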
Pages: 1441-1465
Page count: 25