Clinical application potential of large language model: a study based on thyroid nodules

Cited by: 4

Authors:

Xia, Shujun [1,2]

Hua, Qing [1]

Mei, Zihan [1]

Xu, Wenwen [1]

Lai, Limei [1]

Wei, Minyan [1]

Qin, Yu [1]

Luo, Lin [3]

Wang, Changhua [4]

Huo, ShengNan [5]

Fu, Lijun [6]

Zhou, Feidu [7]

Wu, Jiang [8]

Zhang, Li [9]

Lv, De [10]

Li, Jianxin [11]

Wang, Xin [12]

Li, Ning [13]

Song, Yanyan [14]

Zhou, Jianqiao [1,2]

Affiliations:

[1] Shanghai Jiao Tong Univ, Sch Med, Ruijin Hosp, Dept Ultrasound, Shanghai, Peoples R China

[2] Shanghai Jiao Tong Univ, Coll Hlth Sci & Technol, Sch Med, Shanghai, Peoples R China

[3] Kongjiang Hosp, Dept Endocrinol, Shanghai, Peoples R China

[4] Xianning 1 Peoples Hosp, Dept Thyroid & Breast Surg, Xianning, Peoples R China

[5] Handan Hangang Hosp, Dept Thyroid, Handan, Hebei, Peoples R China

[6] Zhengzhou Univ, Dept Thyroid Surg, Affiliated Hosp 1, Zhengzhou, Peoples R China

[7] LiuYang Peoples Hosp, Thyroid & Breast Surg, Changsha, Peoples R China

[8] Fourth Mil Med Univ, Xijing Hosp, Dept Thyroid Breast & Vasc Surg, Xian, Shaanxi, Peoples R China

[9] Shanxi Prov Canc Hosp, Dept Head & Neck Surg, Taiyuan, Peoples R China

[10] Hosp Chengdu Univ Tradit Chinese Med, Dept Endocrinol, Chengdu, Peoples R China

[11] Mazhanghuiwen Hosp, Dept Surg, Zhanjiang, Guangdong, Peoples R China

[12] Lianshui Peoples Hosp, Endocrine Dept, Huaian, Jiangsu, Peoples R China

[13] Kunming Univ Sci & Technol, Anning Peoples Hosp 1, Dept Ultrasound, Anning, Yunnan, Peoples R China

[14] Shanghai Jiao Tong Univ, Sch Med, Inst Med Sci, Dept Biostat, Shanghai, Peoples R China

Source:

ENDOCRINE | 2025 / Vol. 87 / Issue 01

Funding:

National Natural Science Foundation of China;

Keywords:

Artificial intelligence; LLM; ChatGPT; New Bing Chat; ACTIVE SURVEILLANCE; CANCER; MANAGEMENT; AMERICAN;

DOI:

10.1007/s12020-024-03981-3

Chinese Library Classification number:

R5 [Internal Medicine];

Discipline classification code:

1002 ; 100201 ;

Abstract:

Background: Limited data are available on the performance of large language models (LLMs) taking on the role of doctors. We aimed to investigate the potential of ChatGPT-3.5 and New Bing Chat to act as doctors, using thyroid nodules as an example. Methods: A total of 145 patients with thyroid nodules were included for generating questions. Each question was entered into the ChatGPT-3.5 and New Bing Chat chatbots five times, yielding five responses from each. These responses were compared with answers given by five junior doctors. Responses from five senior doctors were regarded as the gold standard. The accuracy and reproducibility of the responses from ChatGPT-3.5 and New Bing Chat were evaluated. Results: The accuracy of ChatGPT-3.5 and New Bing Chat in answering Q2, Q3, and Q5 was lower than that of junior doctors (all P < 0.05), while both LLMs were comparable to junior doctors when answering Q4 and Q6. In terms of "high reproducibility and accuracy", ChatGPT-3.5 outperformed New Bing Chat in Q1 and Q5 (P < 0.001 and P = 0.008, respectively), but the two showed no significant difference in Q2, Q3, Q4, and Q6 (P > 0.05 for all). New Bing Chat achieved higher accuracy than ChatGPT-3.5 in decision making for thyroid nodules (72.41% vs 58.62%, P = 0.003), and both were less accurate than junior doctors (89.66%, P < 0.001 for both). Conclusions: The exploration of ChatGPT-3.5 and New Bing Chat in the diagnosis and management of thyroid nodules illustrates that LLMs currently show potential for medical applications but do not yet reach the clinical decision-making capacity of doctors.
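As a reading aid, the decision-making accuracies quoted in the abstract can be checked against the cohort size. The sketch below assumes (the abstract does not state this explicitly) that each percentage is a fraction of the 145 included patients; the helper names `implied_count` and `as_percent` are illustrative only. Under that assumption the figures correspond to whole-number counts.

```python
# Assumption: each decision-making accuracy is a share of the 145 included patients.
N_PATIENTS = 145

# Percentages as reported in the abstract.
reported = {
    "New Bing Chat": 72.41,
    "ChatGPT-3.5": 58.62,
    "Junior doctors": 89.66,
}


def implied_count(percent, n=N_PATIENTS):
    """Whole number of correct cases closest to the reported percentage."""
    return round(percent / 100 * n)


def as_percent(count, n=N_PATIENTS):
    """Convert a count back to a percentage rounded to two decimals."""
    return round(count / n * 100, 2)


implied = {name: implied_count(p) for name, p in reported.items()}

for name, count in implied.items():
    print(f"{name}: {count}/{N_PATIENTS} = {as_percent(count):.2f}%")
```

If the assumption holds, each implied count converts back to exactly the reported percentage, which is a quick consistency check on the quoted numbers rather than a reproduction of the study's statistics.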


Pages: 206-213

Page count: 8

26 references in total

[1] Abdel-Messih MS. JMIR MED EDUC, 2023, 9. DOI: 10.2196/46876

[2] Active Surveillance for Low-Risk Differentiated Thyroid Cancer [J]. Ahmadi, Sara; Alexander, Erik K. ENDOCRINE PRACTICE, 2023, 29 (02): 148-153

[3] Diagnosis of thyroid nodules [J]. Alexander, Erik K.; Cibas, Edmund S. LANCET DIABETES & ENDOCRINOLOGY, 2022, 10 (07): 533-539

[4] Comparison Between ChatGPT and Google Search as Sources of Postoperative Patient Instructions [J]. Ayoub, Noel F.; Lee, Yu-Jin; Grimm, David; Balakrishnan, Karthik. JAMA OTOLARYNGOLOGY-HEAD & NECK SURGERY, 2023, 149 (06): 556-+

[5] ChatGPT and the Future of Medical Writing [J]. Biswas, Som. RADIOLOGY, 2023, 307 (02)

[6] Thyroid cancer [J]. Cabanillas, Maria E.; McFadden, David G.; Durante, Cosimo. LANCET, 2016, 388 (10061): 2783-2795

[7] Performance of Generative Large Language Models on Ophthalmology Board-Style Questions [J]. Cai, Louis Z.; Shaheen, Abdulla; Jin, Andrew; Fukui, Riya; Yi, Jonathan S.; Yannuzzi, Nicolas; Alabiad, Chrisfouad. AMERICAN JOURNAL OF OPHTHALMOLOGY, 2023, 254: 141-149

[8] Evaluating the Feasibility of ChatGPT in Healthcare: An Analysis of Multiple Clinical and Research Scenarios [J]. Cascella, Marco; Montomoli, Jonathan; Bellini, Valentina; Bignami, Elena. JOURNAL OF MEDICAL SYSTEMS, 2023, 47 (01)

[9] Thyroid cancer [J]. Chen, Debbie W.; Lang, Brian H. H.; McLeod, Donald S. A.; Newbold, Kate; Haymart, Megan R. LANCET, 2023, 401 (10387): 1531-1544

[10] The Diagnosis and Management of Thyroid Nodules: A Review [J]. Durante, Cosimo; Grani, Giorgio; Lamartina, Livia; Filetti, Sebastian; Mandel, Susan J.; Cooper, David S. JAMA-JOURNAL OF THE AMERICAN MEDICAL ASSOCIATION, 2018, 319 (09): 914-924
