ChatGPT Earns American Board Certification in Hand Surgery

Cited by: 15
Authors
Ghanem, Diane [1 ]
Nassar, Joseph E.
El Bachour, Joseph [2 ]
Hanna, Tammam [3 ]
Affiliations
[1] Johns Hopkins Univ Hosp, Dept Orthopaed Surg, Baltimore, MD 21287 USA
[2] Amer Univ Beirut, Fac Med, Beirut, Lebanon
[3] Texas Tech Univ, Hlth Sci Ctr, Dept Orthopaed Surg & Rehabil, Lubbock, TX USA
Keywords
Artificial intelligence; Language model; ChatGPT; Orthopaedic surgery; Hand surgery; Hand board examination
DOI
10.1016/j.hansur.2024.101688
Chinese Library Classification
R826.8 [Plastic surgery]; R782.2 [Oral and maxillofacial plastic surgery]; R726.2 [Pediatric plastic surgery]; R62 [Plastic surgery (reconstructive surgery)]
Abstract
Purpose: Artificial intelligence (AI), and specifically ChatGPT, has shown potential in healthcare, yet its performance on specialized medical examinations such as the Orthopaedic Surgery In-Training Examination and the European Board of Hand Surgery diploma has been inconsistent. This study aims to evaluate the capability of ChatGPT-4 to pass the American hand surgery certifying examination.

Methods: ChatGPT-4 was tested on the 2019 American Society for Surgery of the Hand (ASSH) Self-Assessment Examination. All 200 questions available online (https://onlinecme.assh.org) were retrieved. All media-containing questions were flagged and carefully reviewed. Eight media-containing questions were excluded because they either relied purely on videos or could not be rationalized from the presented information. Descriptive statistics were used to summarize ChatGPT-4's performance (% correct). The ASSH report was used to compare ChatGPT-4's performance with that of the 322 physicians who completed the 2019 ASSH self-assessment.

Results: ChatGPT-4 answered 192 questions, with an overall score of 61.98%. Performance on media-containing questions was 55.56%, while on non-media questions it was 65.83%, with no statistically significant difference based on media inclusion. Despite scoring below the average physician's performance, ChatGPT-4 outperformed physicians in the 'vascular' section with 81.82%. Its performance was lower in the 'bone and joint' (48.54%) and 'neuromuscular' (56.25%) sections.

Conclusions: ChatGPT-4 achieved a good overall score of 61.98%. This AI language model demonstrates significant capability in processing and answering specialized medical examination questions, albeit with room for improvement in areas requiring complex clinical judgment and nuanced interpretation. ChatGPT-4's proficiency is influenced by the structure and language of the examination, and it is no replacement for the depth of trained medical specialists. This study underscores the supportive role of AI in medical education and clinical decision-making while highlighting the current limitations in nuanced fields such as hand surgery.

(c) 2024 SFCM. Published by Elsevier Masson SAS. All rights reserved.
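The reported subset and overall percentages are mutually consistent under one reading of the question counts. The split below (40/72 correct on media questions, 79/120 on non-media questions) is not stated in the abstract; it is a hypothetical reconstruction that happens to reproduce all three reported figures, shown here purely as an arithmetic sanity check:

```python
# Consistency check for the reported ASSH self-assessment scores.
# NOTE: the per-subset correct/total counts are ASSUMED (inferred so that
# they reproduce the reported percentages), not taken from the study.

media_correct, media_total = 40, 72          # assumed -> reported 55.56%
nonmedia_correct, nonmedia_total = 79, 120   # assumed -> reported 65.83%

overall_correct = media_correct + nonmedia_correct   # 119
overall_total = media_total + nonmedia_total         # 192 questions answered

def pct(correct: int, total: int) -> float:
    """Percentage correct, rounded to two decimals as in the abstract."""
    return round(100 * correct / total, 2)

print(pct(media_correct, media_total))        # 55.56
print(pct(nonmedia_correct, nonmedia_total))  # 65.83
print(pct(overall_correct, overall_total))    # 61.98
```

Under these assumed counts the overall score is exactly the question-weighted average of the two subset scores, which is why it falls between them rather than at their midpoint.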
Pages: 6