Assessing the accuracy of ChatGPT references in head and neck and ENT disciplines

Cited by: 28

Authors:

Frosolini, Andrea [1]

Franz, Leonardo [2,3]

Benedetti, Simone [1]

Vaira, Luigi Angelo [4,5]

de Filippis, Cosimo [2]

Gennaro, Paolo [1]

Marioni, Gino [2]

Gabriele, Guido [1]

Affiliations:

[1] Univ Siena, Dept Maxillofacial Surg, Policlin Le Scotte, Siena, Italy

[2] Univ Padua, Dept Neurosci DNS, Phoniatris & Audiol Unit, Treviso, Italy

[3] Univ Brescia, Dept Clin & Expt Sci, Artificial Intelligence Med & Innovat Clin Res & M, Brescia, Italy

[4] Univ Sassari, Dept Med Surg & Pharm, Maxillofacial Surg Operat Unit, Sassari, Italy

[5] Univ Sassari, PhD Sch Biomed Sci, Dept Biomed Sci, Sassari, Italy

Source:

EUROPEAN ARCHIVES OF OTO-RHINO-LARYNGOLOGY | 2023 / Vol. 280 / Issue 11

Keywords:

Head and neck surgery; Maxillofacial; AI; Chat-GPT; Artificial intelligence

DOI:

10.1007/s00405-023-08205-4

Chinese Library Classification:

R76 [Otorhinolaryngology]

Discipline classification code:

100213

Abstract:

Purpose: ChatGPT has gained popularity as a web application since its release in 2022. While the potential of artificial intelligence (AI) systems in scientific writing is widely discussed, their reliability in reviewing literature and providing accurate references remains unexplored. This study examines the reliability of references generated by ChatGPT language models in the Head and Neck field.

Methods: Twenty clinical questions were generated across different Head and Neck disciplines and used to prompt ChatGPT versions 3.5 and 4.0 to produce texts on the assigned topics. The generated references were categorized as "true," "erroneous," or "inexistent" based on congruence with existing records in scientific databases.

Results: ChatGPT 4.0 outperformed version 3.5 in terms of reference reliability. However, both versions displayed a tendency to provide erroneous or non-existent references.

Conclusions: It is crucial to address this challenge to maintain the reliability of scientific literature. Journals and institutions should establish strategies and good-practice principles in the evolving landscape of AI-assisted scientific writing.

Pages: 5129-5133

Number of pages: 5
