Comparative analysis of GPT-4-based ChatGPT's diagnostic performance with radiologists using real-world radiology reports of brain tumors

Cited by: 7
Authors
Mitsuyama, Yasuhito [1]
Tatekawa, Hiroyuki [1]
Takita, Hirotaka [1]
Sasaki, Fumi [1]
Tashiro, Akane [1]
Oue, Satoshi [1]
Walston, Shannon L. [1]
Nonomiya, Yuta [2]
Shintani, Ayumi [2]
Miki, Yukio [1]
Ueda, Daiju [1,3]
Affiliations
[1] Osaka Metropolitan Univ, Grad Sch Med, Dept Diag & Intervent Radiol, 1-4-3 Asahi Machi,Abeno Ku, Osaka 5458585, Japan
[2] Osaka Metropolitan Univ, Grad Sch Med, Dept Med Stat, 1-4-3 Asahi Machi,Abeno Ku, Osaka 5458585, Japan
[3] Osaka Metropolitan Univ, Ctr Hlth Sci Innovat, 1-4-3 Asahi Machi,Abeno Ku, Osaka 5458585, Japan
Keywords
Artificial intelligence; Natural language processing; Radiology; Magnetic resonance imaging; Brain tumor;
DOI
10.1007/s00330-024-11032-8
Chinese Library Classification (CLC) number
R8 [Special Medicine]; R445 [Diagnostic Imaging];
Discipline classification codes
1002 ; 100207 ; 1009 ;
Abstract
Objectives: Large language models like GPT-4 have demonstrated potential for diagnosis in radiology. Previous studies investigating this potential primarily utilized quizzes from academic journals. This study aimed to assess the diagnostic capabilities of GPT-4-based Chat Generative Pre-trained Transformer (ChatGPT) using actual clinical radiology reports of brain tumors and compare its performance with that of neuroradiologists and general radiologists.
Methods: We collected brain MRI reports written in Japanese from preoperative brain tumor patients at two institutions from January 2017 to December 2021. The MRI reports were translated into English by radiologists. GPT-4 and five radiologists were presented with the same textual findings from the reports and asked to suggest differential and final diagnoses. The pathological diagnosis of the excised tumor served as the ground truth. McNemar's test and Fisher's exact test were used for statistical analysis.
Results: Across 150 radiology reports, GPT-4 achieved a final diagnostic accuracy of 73%, while radiologists' accuracy ranged from 65% to 79%. GPT-4's final diagnostic accuracy was higher with reports written by neuroradiologists (80%) than with reports written by general radiologists (60%). For differential diagnoses, GPT-4's accuracy was 94%, while radiologists' accuracy fell between 73% and 89%. Notably, for these differential diagnoses, GPT-4's accuracy remained consistent whether reports were from neuroradiologists or general radiologists.
Conclusion: GPT-4 exhibited good diagnostic capability, comparable to neuroradiologists in differentiating brain tumors from MRI reports. GPT-4 can be a second opinion for neuroradiologists on final diagnoses and a guidance tool for general radiologists and residents.
Clinical relevance statement: This study evaluated GPT-4-based ChatGPT's diagnostic capabilities using real-world clinical MRI reports from brain tumor cases, revealing that its accuracy in interpreting brain tumors from MRI findings is competitive with radiologists.
Key Points...
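The abstract names McNemar's test and Fisher's exact test but does not say how they were applied. The following is a minimal sketch, not the authors' analysis code. It assumes McNemar's test is the natural fit for paired comparisons (two readers judged on the same reports) and Fisher's exact test for comparing accuracy between two independent groups of reports. The function names and all counts are hypothetical and are not taken from the study.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant counts.

    b: cases reader A got right and reader B got wrong (hypothetical input)
    c: cases reader A got wrong and reader B got right (hypothetical input)
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so that ties with p_obs are counted.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


if __name__ == "__main__":
    # Hypothetical counts for illustration only (NOT data from the study).
    print("McNemar (paired readers):", mcnemar_exact(b=12, c=5))
    print("Fisher (two report groups):", fisher_exact_two_sided(18, 7, 11, 14))
```

In practice a statistics library would normally be used instead of hand-written routines. The standard-library version above only makes explicit which counts each test consumes.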
Pages: 1938-1947
Number of pages: 10