Generative Pre-trained Transformer 4 makes cardiovascular magnetic resonance reports easy to understand

Cited by: 13
Authors
Salam, Babak [1, 2]
Kravchenko, Dmitrij [1, 2]
Nowak, Sebastian [1, 2]
Sprinkart, Alois M. [1, 2]
Weinhold, Leonie [3]
Odenthal, Anna [1]
Mesropyan, Narine [1, 2]
Bischoff, Leon M. [1, 2]
Attenberger, Ulrike [1]
Kuetting, Daniel L. [1, 2]
Luetkens, Julian A. [1, 2]
Isaak, Alexander [1, 2]
Affiliations
[1] Univ Hosp Bonn, Dept Diagnost & Intervent Radiol, Venusberg Campus 1, D-53127 Bonn, Germany
[2] Univ Hosp Bonn, Quant Imaging Lab Bonn QILaB, Venusberg Campus 1, D-53127 Bonn, Germany
[3] Univ Hosp Bonn, Dept Med Biometry Informat & Epidemiol, Venusberg Campus 1, D-53127 Bonn, Germany
Keywords
Generative Pre-trained Transformers; Cardiovascular magnetic resonance; Artificial intelligence; Text simplification; Large language models; Radiology reports; Readability
DOI
10.1016/j.jocmr.2024.101035
Chinese Library Classification (CLC) number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Background: Patients are increasingly using Generative Pre-trained Transformer 4 (GPT-4) to better understand their own radiology findings.

Purpose: To evaluate the performance of GPT-4 in transforming cardiovascular magnetic resonance (CMR) reports into text that is comprehensible to medical laypersons.

Methods: ChatGPT based on GPT-4 was used to generate three explained versions of each of 20 different CMR reports (n = 60), always with the same prompt: "Explain the radiology report in a language understandable to a medical layperson". Two cardiovascular radiologists evaluated the GPT-4 reports for understandability, factual correctness, completeness of relevant findings, and lack of potential harm, while 13 medical laypersons evaluated the understandability of the original and the GPT-4 reports. All ratings used a Likert scale (1 = "strongly disagree", 5 = "strongly agree"). Readability was measured using the Automated Readability Index (ARI). Values are given as median [interquartile range]. Linear mixed-effects models and the intraclass correlation coefficient (ICC) were used for statistical analysis.

Results: GPT-4 reports were generated in 52 ± 13 s on average. GPT-4 reports achieved a lower ARI score than the original reports (original vs GPT-4: 10 [9-12] vs 5 [4-6]; p < 0.001) and were subjectively easier for laypersons to understand (original vs GPT-4: 1 [1] vs 4 [4, 5]; p < 0.001). Eighteen of 20 (90%) standard CMR reports and 2 of 60 (3%) GPT-generated reports had an ARI score corresponding to the 8th grade level or higher. Radiologists' ratings of the GPT-4 reports reached high levels for correctness (5 [4, 5]), completeness (5 [5]), and lack of potential harm (5 [5]), with "strong agreement" for factual correctness in 94% (113/120) and for completeness of relevant findings in 81% (97/120) of ratings. Test-retest agreement for layperson understandability ratings between the three simplified reports generated from the same original report was substantial (ICC: 0.62; p < 0.001). Interrater agreement between radiologists was almost perfect for lack of potential harm (ICC: 0.93; p < 0.001) and moderate to substantial for completeness (ICC: 0.76; p < 0.001) and factual correctness (ICC: 0.55; p < 0.001).

Conclusion: GPT-4 can reliably transform complex CMR reports into more understandable, layperson-friendly language while largely maintaining factual correctness and completeness, and can thus help convey patient-relevant radiology information in an easy-to-understand manner.
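The Automated Readability Index named in the abstract has a standard published formula: ARI = 4.71 × (characters / words) + 0.5 × (words / sentences) − 21.43, where a higher score corresponds to a higher US school grade level. The record does not say which tool or tokenization rules the authors used, so the following Python sketch is only an illustration under stated assumptions: the function name `automated_readability_index`, the regular expressions for words and sentences, and the sample texts are all hypothetical and not taken from the study.

```python
import re


def automated_readability_index(text: str) -> float:
    """Return the ARI of `text` using the standard formula.

    Assumptions (not from the study): words are runs of letters/digits,
    optionally joined by an apostrophe or hyphen; sentences end at '.', '!'
    or '?'. This naive splitting miscounts decimals and abbreviations.
    """
    words = re.findall(r"[A-Za-z0-9]+(?:['-][A-Za-z0-9]+)*", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    # ARI counts letters and digits only; punctuation and spaces are ignored.
    characters = sum(ch.isalnum() for ch in text)
    return (
        4.71 * characters / len(words)
        + 0.5 * len(words) / len(sentences)
        - 21.43
    )


# Hypothetical sample texts, written for this illustration only.
report_style = (
    "Cardiovascular magnetic resonance demonstrates subepicardial "
    "late gadolinium enhancement."
)
lay_style = "The scan shows a small scar on the heart. It is not dangerous."

if __name__ == "__main__":
    print(f"report-style ARI: {automated_readability_index(report_style):.1f}")
    print(f"lay-style ARI:    {automated_readability_index(lay_style):.1f}")
```

Long words and long sentences both raise the score, which is why jargon-dense report text tends to score well above short plain-language sentences. Scores from this sketch should not be expected to match the values in the abstract, since different tools count sentences and characters differently.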
Pages: 8