Using Large Language Models to Generate Script Concordance Test in Medical Education: ChatGPT and Claude

Cited by: 1
Authors
Kiyak, Yavuz Selim [1]
Emekli, Emre [2]
Affiliations
[1] Gazi Univ, Fac Med, Dept Med Educ & Informat, Ankara, Turkiye
[2] Eskisehir Osmangazi Univ, Fac Med, Dept Radiol, Eskisehir, Turkiye
Source
SPANISH JOURNAL OF MEDICAL EDUCATION | 2025, Vol. 6, No. 1
Keywords
script concordance test; clinical reasoning; medical education; artificial intelligence; ChatGPT
DOI
10.6018/edumed.636331
Chinese Library Classification (CLC) number
G40 [Education]
Discipline classification codes
040101; 120403
Abstract
We aimed to determine the quality of Script Concordance Test (SCT) items generated by AI (ChatGPT-4 and Claude 3), as judged by an expert panel. In April 2024, we generated SCT items on abdominal radiology using a complex prompt in two large language model (LLM) chatbots, ChatGPT-4 and Claude 3 (Sonnet), and evaluated the items' quality through an expert panel of 16 radiologists. The panel, which was blinded to the origin of the items and received them without modification, independently answered each item and assessed it using 12 quality indicators. Data analysis included descriptive statistics, bar charts to compare responses against acceptable and unacceptable forms, and a heatmap to show performance on the quality indicators. The SCT items generated by the chatbots were found to assess clinical reasoning rather than only factual recall (ChatGPT: 92.50%, Claude: 85.00%). The heatmap indicated that the items were generally acceptable, with most responses favorable across the quality indicators (ChatGPT: 71.77%, Claude: 64.23%). Comparing the bar charts with acceptable and unacceptable forms showed that 73.33% of the questions in the ChatGPT items and 53.33% of those in the Claude items can be considered acceptable. Using LLMs to generate SCT items can help medical educators by reducing the time and effort required. Although the prompt provides a good starting point, it remains crucial to review and revise AI-generated SCT items before educational use. The prompt and the custom GPT, "Script Concordance Test Generator", available at https://chatgpt.com/g/g-RlzW5xdc1-script-concordance-test-generator, can streamline SCT item development.
Pages: 8