Performance of Chat Generative Pre-trained Transformer-4o in the Adult Clinical Cardiology Self-Assessment Program

Cited by: 1
Authors
Malik, Abdulaziz [1]
Madias, Christopher [1]
Wessler, Benjamin S. [1]
Affiliations
[1] Tufts Med Ctr, Cardiovasc Ctr, 800 Washington St, Boston, MA 02111 USA
Source
EUROPEAN HEART JOURNAL - DIGITAL HEALTH | 2024 / Vol. 6 / Issue 01
Keywords
Medical education; Artificial intelligence; Large language models;
DOI
10.1093/ehjdh/ztae077
Chinese Library Classification (CLC) number
R5 [Internal Medicine];
Discipline classification code
1002; 100201;
Abstract
Aims: This study evaluates the performance of OpenAI's latest large language model (LLM), Chat Generative Pre-trained Transformer-4o, on the Adult Clinical Cardiology Self-Assessment Program (ACCSAP).
Methods and results: Chat Generative Pre-trained Transformer-4o was tested on 639 ACCSAP questions, excluding 45 questions containing video clips, resulting in 594 questions for analysis. The questions included a mix of text-based and static image-based [electrocardiogram (ECG), angiogram, computed tomography (CT) scan, and echocardiogram] formats. The model was allowed one attempt per question. Further evaluation of image-only questions was performed on 25 questions from the database. Chat Generative Pre-trained Transformer-4o correctly answered 69.2% (411/594) of the questions. The performance was higher for text-only questions (73.9%) compared with those requiring image interpretation (55.3%, P < 0.001). The model performed worse on questions involving ECGs, with a correct rate of 56.5% compared with 73.3% for non-ECG questions (P < 0.001). Despite its capability to interpret medical images in the context of a text-based question, the model's accuracy varied, demonstrating strengths and notable gaps in diagnostic accuracy. It lacked accuracy in reading images (ECGs, echocardiography, and angiograms) with no context.
Conclusion: Chat Generative Pre-trained Transformer-4o performed moderately well on ACCSAP questions. However, the model's performance remains inconsistent, especially in interpreting ECGs. These findings highlight the potential and current limitations of using LLMs in medical education and clinical decision-making.
Pages: 155-158
Page count: 4
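The headline figures in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the three counts come from the abstract, while the function and variable names are assumptions introduced here and are not part of the study. The subgroup counts behind the text-only, image, and ECG percentages are not given in this record, so the reported P values cannot be reproduced from it.

```python
# Counts stated in the abstract.
TOTAL_QUESTIONS = 639          # ACCSAP questions presented to the model
VIDEO_QUESTIONS_EXCLUDED = 45  # questions containing video clips
CORRECT_ANSWERS = 411          # correctly answered questions


def analysed_count(total: int, excluded: int) -> int:
    """Number of questions left for analysis after exclusions."""
    return total - excluded


def accuracy_percent(correct: int, analysed: int) -> float:
    """Share of correct answers, expressed as a percentage."""
    if analysed <= 0:
        raise ValueError("analysed must be positive")
    return 100.0 * correct / analysed


analysed = analysed_count(TOTAL_QUESTIONS, VIDEO_QUESTIONS_EXCLUDED)
overall = accuracy_percent(CORRECT_ANSWERS, analysed)

print(f"Questions analysed: {analysed}")
print(f"Overall accuracy: {overall:.1f}% ({CORRECT_ANSWERS}/{analysed})")
```

Rounding the result to one decimal place should agree with the overall accuracy quoted in the abstract.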
    Simpson, Ashley, I
    POSTGRADUATE MEDICAL JOURNAL, 2023, 99 (1176) : 1110 - 1114