Performance of Chat Generative Pre-trained Transformer-4o in the Adult Clinical Cardiology Self-Assessment Program

Cited by: 1
Authors
Malik, Abdulaziz [1 ]
Madias, Christopher [1 ]
Wessler, Benjamin S. [1 ]
Affiliations
[1] Tufts Med Ctr, Cardiovasc Ctr, 800 Washington St, Boston, MA 02111 USA
Source
EUROPEAN HEART JOURNAL - DIGITAL HEALTH | 2024 / Vol. 6 / No. 1
Keywords
Medical education; Artificial intelligence; Large language models
DOI
10.1093/ehjdh/ztae077
Chinese Library Classification
R5 [Internal medicine]
Discipline codes
1002; 100201
Abstract
Aims: This study evaluates the performance of OpenAI's latest large language model (LLM), Chat Generative Pre-trained Transformer-4o, on the Adult Clinical Cardiology Self-Assessment Program (ACCSAP).

Methods and results: Chat Generative Pre-trained Transformer-4o was tested on 639 ACCSAP questions; 45 questions containing video clips were excluded, leaving 594 questions for analysis. The questions included a mix of text-based and static image-based [electrocardiogram (ECG), angiogram, computed tomography (CT) scan, and echocardiogram] formats. The model was allowed one attempt per question. A further evaluation of image-only questions was performed on 25 questions from the database. Chat Generative Pre-trained Transformer-4o correctly answered 69.2% (411/594) of the questions. Performance was higher on text-only questions (73.9%) than on those requiring image interpretation (55.3%, P < 0.001). The model performed worse on questions involving ECGs, with a correct rate of 56.5% compared with 73.3% for non-ECG questions (P < 0.001). Although the model can interpret medical images presented in the context of a text-based question, its accuracy varied, showing both strengths and notable gaps in diagnostic accuracy, and it could not accurately read images (ECGs, echocardiograms, and angiograms) presented without context.

Conclusion: Chat Generative Pre-trained Transformer-4o performed moderately well on ACCSAP questions. However, the model's performance remains inconsistent, especially in interpreting ECGs. These findings highlight both the potential and the current limitations of using LLMs in medical education and clinical decision-making.
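The headline figures in the abstract can be sanity-checked with a few lines of Python. Note that the record does not report the text-only vs. image-question counts, so the split computed below is back-solved from the reported accuracies and is only an estimate, not a figure from the paper.

```python
# Sanity-check of the accuracy figures reported in the abstract.
TOTAL_POOL = 639        # ACCSAP questions considered
VIDEO_EXCLUDED = 45     # questions with video clips, excluded
analyzed = TOTAL_POOL - VIDEO_EXCLUDED          # 594 questions analysed
correct = 411
overall_acc = 100 * correct / analyzed          # reported as 69.2%

# Reported per-format accuracies (proportions).
ACC_TEXT, ACC_IMAGE = 0.739, 0.553

# Back-solve ACC_TEXT*t + ACC_IMAGE*(analyzed - t) = correct for t,
# the (estimated) number of text-only questions.
text_questions = (correct - ACC_IMAGE * analyzed) / (ACC_TEXT - ACC_IMAGE)
image_questions = analyzed - text_questions

print(f"analysed questions: {analyzed}")
print(f"overall accuracy:   {overall_acc:.1f}%")
print(f"estimated split:    ~{text_questions:.0f} text / "
      f"~{image_questions:.0f} image")
```

The back-solved split (roughly 440–450 text-only questions) is consistent with the reported overall accuracy but is an inference, not data from the record.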
Pages: 155-158
Page count: 4