ChatGPT-4 Consistency in Interpreting Laryngeal Clinical Images of Common Lesions and Disorders

Cited by: 2
Authors
Maniaci, Antonino [1, 2]
Chiesa-Estomba, Carlos M. [1, 3, 4]
Lechien, Jerome R. [1, 3, 5, 6]
Affiliations
[1] Res Comm Young Otolaryngologists Int Federat Otorh, Paris, France
[2] Kore Univ, Dept Med & Surg, Enna, Italy
[3] Univ Mons UMons, UMONS Res Inst Hlth Sci & Technol, Div Laryngol & Bronchoesophagol, Dept Otolaryngol Head Neck Surg,EpiCURA Hosp, Mons, Belgium
[4] Donostia Univ Hosp, Dept Otorhinolaryngol Head & Neck Surg, Donostia San Sebastian, Spain
[5] Paris Saclay Univ, Univ Sorbonne Nouvelle Paris 3, Dept Otorhinolaryngol & Head & Neck Surg, Foch Hosp,Phonet Phonol Lab,CNRS,UMR 7018, Paris, France
[6] CHU St Pierre, Dept Otorhinolaryngol Head & Neck Surg, Brussels, Belgium
Keywords
accuracy; artificial intelligence; ChatGPT; GPT; head neck surgery; images; laryngology; otolaryngology; picture; video
DOI
10.1002/ohn.897
Chinese Library Classification
R76 [Otorhinolaryngology]
Discipline classification code
100213
Abstract
Objective: To investigate the consistency of Chatbot Generative Pretrained Transformer (ChatGPT)-4 in the analysis of clinical pictures of common laryngological conditions.
Study Design: Prospective uncontrolled study.
Setting: Multicenter study.
Methods: Patient history and clinical videolaryngostroboscopic images were presented to ChatGPT-4 for differential diagnoses, management, and treatment(s). ChatGPT-4 responses were assessed by 3 blinded laryngologists with the artificial intelligence performance instrument (AIPI). The complexity of cases and the consistency between practitioners and ChatGPT-4 in interpreting clinical images were evaluated with a 5-point Likert scale. The intraclass correlation coefficient (ICC) was used to measure the strength of interrater agreement.
Results: Forty patients with a mean complexity score of 2.60 ± 1.15 were included. The mean consistency score for ChatGPT-4 image interpretation was 2.46 ± 1.42. ChatGPT-4 perfectly analyzed the clinical images in 6 cases (15%; 5/5), while the consistency between ChatGPT-4 and judges was high in 5 cases (12.5%; 4/5). Judges reported an ICC of 0.965 for the consistency score (P = .001). ChatGPT-4 erroneously documented vocal fold irregularity (mass or lesion), glottic insufficiency, and vocal cord paralysis in 21 (52.5%), 2 (5.0%), and 5 (12.5%) cases, respectively. ChatGPT-4 and practitioners indicated 153 and 63 additional examinations, respectively (P = .001). The ChatGPT-4 primary diagnosis was correct in 20.0% to 25.0% of cases. The clinical image consistency score was significantly associated with the AIPI score (rs = 0.830; P = .001).
Conclusion: ChatGPT-4 is more efficient in primary diagnosis than in image analysis and in selecting the most adequate additional examinations and treatments.
Pages: 1106-1113
Number of pages: 8