Assessing the Performance of Chat Generative Pretrained Transformer (ChatGPT) in Answering Andrology-Related Questions

Cited by: 16

Authors:

Caglar, Ufuk [1]

Yildiz, Oguzhan [1]

Ozervarli, M. Firat [2]

Aydin, Resat [2]

Sarilar, Omer [1]

Ozgor, Faruk [1]

Ortac, Mazhar [2]

Affiliations:

[1] Haseki Training & Res Hosp, Dept Urol, Istanbul, Turkiye

[2] Istanbul Univ, Istanbul Sch Med, Dept Urol, Istanbul, Turkiye

Source:

UROLOGY RESEARCH AND PRACTICE | 2023 / Vol. 49 / Issue 06

Keywords:

Andrology; artificial intelligence; information sources

DOI:

10.5152/tud.2023.23171

Chinese Library Classification (CLC) number:

R5 [Internal Medicine]; R69 [Urology (Diseases of the Genitourinary System)]

Discipline classification code:

1002; 100201

Abstract:

Objective: The internet and social media have become primary sources of health information, with men frequently turning to these platforms before seeking professional help. Chat generative pretrained transformer (ChatGPT), an artificial intelligence model developed by OpenAI, has gained popularity as a natural language processing program. The present study evaluated the accuracy and reproducibility of ChatGPT's responses to andrology-related questions.

Methods: The study analyzed frequently asked andrology questions from health forums, hospital websites, and social media platforms like YouTube and Instagram. Questions were categorized into topics like male hypogonadism, erectile dysfunction, etc. The European Association of Urology (EAU) guideline recommendations were also included. These questions were input into ChatGPT, and responses were evaluated by 3 experienced urologists who scored them on a scale of 1 to 4.

Results: Out of 136 evaluated questions, 108 met the criteria. Of these, 87.9% received correct and adequate answers, 9.3% were correct but insufficient, and 3 responses contained both correct and incorrect information. No question was answered completely wrong. The highest correct answer rates were for disorders of ejaculation, penile curvature, and male hypogonadism. The EAU guideline-based questions achieved a correctness rate of 86.3%. The reproducibility of the answers was over 90%.

Conclusion: The study found that ChatGPT provided accurate and reliable answers to over 80% of andrology-related questions. While limitations exist, such as potential outdated data and inability to understand emotional aspects, ChatGPT's potential in the health-care sector is promising. Collaborating with health-care professionals during artificial intelligence model development could enhance its reliability.

Citation

Pages: 365-369

Page count: 92

15 references in total

[1] Ali R., 2023, Neurosurgery

[2] Bonetti, Mario Alessandri; Giorgino, Riccardo; Afflitto, Gabriele Gallo; De Lorenzi, Francesca; Egro, Francesco M. How Does ChatGPT Perform on the Italian Residency Admission National Exam Compared to 15,869 Medical Graduates? [J]. ANNALS OF BIOMEDICAL ENGINEERING, 2024, 52 (04): 745-749

[3] Deiana, Giovanna; Dettori, Marco; Arghittu, Antonella; Azara, Antonio; Gabutti, Giovanni; Castiglia, Paolo. Artificial Intelligence and Public Health: Evaluating ChatGPT Responses to Vaccination Myths and Misconceptions [J]. VACCINES, 2023, 11 (07)

[4] Dubin, Justin M.; Aguiar, Jonathan A.; Lin, Jasmine S.; Greenberg, Daniel R.; Keeter, Mary Kate; Fantus, Richard J.; Pham, Minh N.; Hudnall, Matthew T.; Bennett, Nelson E.; Brannigan, Robert E.; Halpern, Joshua A. The broad reach and inaccuracy of men's health information on social media: analysis of TikTok and Instagram [J]. INTERNATIONAL JOURNAL OF IMPOTENCE RESEARCH, 2024, 36 (03): 256-260

[5] EAU Guidelines, 2023, EAU ANN C MIL

[6] Lecler, Augustin; Duron, Loic; Soyer, Philippe. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT [J]. DIAGNOSTIC AND INTERVENTIONAL IMAGING, 2023, 104 (06): 269-274

[7] Lee, Tsung-chun; Staller, Kyle; Botoman, Vlaicu; Pathipati, Mythili P.; Varma, Sanskriti; Kuo, Braden. ChatGPT Answers Common Patient Questions About Colonoscopy [J]. GASTROENTEROLOGY, 2023, 165 (02): 509-+

[8] Openai.com, 2023, Chatgpt: Optimising language models for dialogue

[9] Panthier, C.; Gatinel, D. Success of ChatGPT, an AI language model, in taking the French language version of the European Board of Ophthalmology examination: A novel approach to medical knowledge assessment [J]. JOURNAL FRANCAIS D OPHTALMOLOGIE, 2023, 46 (07): 706-711

[10] Pollard J, 2007, Working with Men via the Internet, Hazardous Waist Tackling Male Weight Problems, P186
