Influence of prior probability information on large language model performance in radiological diagnosis

Cited by: 0

Authors:

Fukushima, Takahiro [1]

Kurokawa, Ryo [1]

Hagiwara, Akifumi [1]

Sonoda, Yuki [1]

Asari, Yusuke [1]

Kurokawa, Mariko [1]

Kanzawa, Jun [1]

Gonoi, Wataru [1]

Abe, Osamu [1]

Affiliations:

[1] Univ Tokyo, Grad Sch Med, Dept Radiol, 7-3-1 Hongo, Bunkyo Ku, Tokyo 1138655, Japan

Source:

JAPANESE JOURNAL OF RADIOLOGY | 2025

Keywords:

Large language model; Artificial intelligence; Claude 3.5 Sonnet; Bayes' theorem

DOI:

10.1007/s11604-025-01743-3

Chinese Library Classification (CLC):

R8 [Special medicine]; R445 [Diagnostic imaging]

Subject classification codes:

1002; 100207; 1009

Abstract:

Purpose: Large language models (LLMs) show promise in radiological diagnosis, but their performance may be affected by the context of the cases presented. Our purpose is to investigate how providing information about prior probabilities influences the diagnostic performance of an LLM in radiological quiz cases.

Materials and methods: We analyzed 322 consecutive cases from Radiology's "Diagnosis Please" quiz using Claude 3.5 Sonnet under three conditions: without context (Condition 1), informed as quiz cases (Condition 2), and presented as primary care cases (Condition 3). Diagnostic accuracy was compared using McNemar's test.

Results: The overall accuracy rate significantly improved in Condition 2 compared to Condition 1 (70.2% vs. 64.9%, p = 0.029). Conversely, the accuracy rate significantly decreased in Condition 3 compared to Condition 1 (59.9% vs. 64.9%, p = 0.027).

Conclusions: Providing information that may influence prior probabilities significantly affects the diagnostic performance of the LLM in radiological cases. This suggests that LLMs may incorporate Bayesian-like principles and adjust the weighting of their diagnostic responses based on prior information, highlighting the potential for optimizing LLM performance in clinical settings by providing relevant contextual information.
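The abstract compares accuracy on the same 322 cases under different conditions using McNemar's test, which depends only on the discordant pairs (cases answered correctly under one condition but not the other). The following is a minimal sketch of the exact (binomial) form of that test, not the authors' analysis code. The record does not say which variant of the test was used or report the discordant counts, so the function name `mcnemar_exact` and the counts in the usage lines are hypothetical.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from discordant pair counts.

    b: cases correct under the first condition only (hypothetical input)
    c: cases correct under the second condition only (hypothetical input)
    """
    n = b + c
    if n == 0:
        # No discordant pairs: no evidence of a difference.
        return 1.0
    k = min(b, c)
    # Under the null hypothesis each discordant pair favors either
    # condition with probability 0.5, so the smaller count follows
    # Binomial(n, 0.5). Sum the lower tail and double it.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)


# Illustrative counts only; these are NOT taken from the study.
p_value = mcnemar_exact(12, 30)
print(f"exact McNemar p-value: {p_value:.4f}")
```

Concordant pairs (both correct or both wrong) do not enter the statistic, which is why two conditions with similar overall accuracy can still differ significantly if the disagreements are one-sided.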


Pages: 934-939

Page count: 6
