Evaluation of ChatGPT's Performance in the Turkish Board of Orthopaedic Surgery Examination

Cited: 0
Authors
Yigitbay, Ahmet [1 ]
Affiliation
[1] Siverek State Hosp, Clin Orthoped & Traumatol, Sanliurfa, Turkiye
Source
HASEKI TIP BULTENI-MEDICAL BULLETIN OF HASEKI, 2024, Vol. 62, No. 4
Keywords
Artificial intelligence; humans; orthopedics; specialty boards
DOI
10.4274/haseki.galenos.2024.10038
Chinese Library Classification: R5 [Internal Medicine]
Discipline Codes: 1002; 100201
Abstract
Aim: Technological advances are driving significant changes in medical education and assessment. In particular, developments in artificial intelligence and natural language processing offer new opportunities in the health sector. This article evaluates the performance of Chat Generative Pre-Trained Transformer (ChatGPT) on the Turkish Orthopaedics and Traumatology Education Council (TOTEK) Qualifying Written Examination and its applicability in this setting. Methods: Questions from the last five years of the TOTEK Qualifying Written Examination were entered as input to ChatGPT. Its responses were assessed under four parameters, compared with the actual exam results, and analyzed statistically. Results: Of the 500 questions, 458 were included in the study. ChatGPT scored 40.2%, 26.3%, 37.3%, 32.9%, and 35.8% on the 2019, 2020, 2021, 2022, and 2023 TOTEK Qualifying Written Examinations, respectively. A simple linear regression model fitted to the yearly correct-answer percentages indicated a slight downward trend over the years. ChatGPT's performance differed significantly from the actual exam results, and its correct-answer percentage fell below the examinees' average success score in every year. Conclusions: Analyzing the applicability of artificial intelligence in this field and its role in training processes is essential for assessing ChatGPT's potential uses and limitations. ChatGPT can serve as a training tool, especially for knowledge-based and logical questions on specific topics. Still, its current performance is not at a level that can replace human decision-making in specialized medical fields.
Pages: 243-249 (7 pages)
Related Articles (50 total)
  • [21] Evaluating Performance of ChatGPT on MKSAP Cardiology Board Review Questions
    Milutinovic, Stefan
    Petrovic, Marija
    Begosh-Mayne, Dustin
    Lopez-Mattei, Juan
    Chazal, Richard A.
    Wood, Malissa J.
    Escarcega, Ricardo O.
    INTERNATIONAL JOURNAL OF CARDIOLOGY, 2024, 417
  • [22] Could ChatGPT-4 pass an anaesthesiology board examination? Follow-up assessment of a comprehensive set of board examination practice questions
    Shay, Denys
    Kumar, Bhawesh
    Redaelli, Simone
    von Wedel, Dario
    Liu, Manqing
    Dershwitz, Mark
    Schaefer, Maximilian S.
    Beam, Andrew
    BRITISH JOURNAL OF ANAESTHESIA, 2024, 132 (01) : 172 - 174
  • [23] Performance of ChatGPT in Board Examinations for Specialists in the Japanese Ophthalmology Society
    Sakai, Daiki
    Maeda, Tadao
    Ozaki, Atsuta
    Kanda, Genki N.
    Kurimoto, Yasuo
    Takahashi, Masayo
    CUREUS JOURNAL OF MEDICAL SCIENCE, 2023, 15 (12)
  • [24] Artificial Intelligence in Orthopaedics: Performance of ChatGPT on Text and Image Questions on a Complete AAOS Orthopaedic In-Training Examination (OITE)
    Hayes, Daniel S.
    Foster, Brian K.
    Makar, Gabriel
    Manzar, Shahid
    Ozdag, Yagiz
    Shultz, Mason
    Klena, Joel C.
    Grandizio, Louis C.
    JOURNAL OF SURGICAL EDUCATION, 2024, 81 (11) : 1645 - 1649
  • [25] Comparison of the problem-solving performance of ChatGPT-3.5, ChatGPT-4, Bing Chat, and Bard for the Korean emergency medicine board examination question bank
    Lee, Go Un
    Hong, Dae Young
    Kim, Sin Young
    Kim, Jong Won
    Lee, Young Hwan
    Park, Sang O.
    Lee, Kyeong Ryong
    MEDICINE, 2024, 103 (09) : E37325
  • [26] ChatGPT and the German board examination for ophthalmology: an evaluation [ChatGPT und die deutsche Facharztprüfung für Augenheilkunde: eine Evaluierung]
    Rémi Yaïci
    M. Cieplucha
    R. Bock
    F. Moayed
    N. E. Bechrakis
    P. Berens
    N. Feltgen
    D. Friedburg
    M. Gräf
    R. Guthoff
    E. M. Hoffmann
    H. Hoerauf
    C. Hintschich
    T. Kohnen
    E. M. Messmer
    M. M. Nentwich
    U. Pleyer
    U. Schaudig
    B. Seitz
    G. Geerling
    M. Roth
    Die Ophthalmologie, 2024, 121 (7) : 554 - 564
  • [27] Benchmarking LLM chatbots' oncological knowledge with the Turkish Society of Medical Oncology's annual board examination questions
    Erdat, Efe Cem
    Kavak, Engin Eren
    BMC CANCER, 2025, 25 (01)
  • [28] Performance of ChatGPT-3.5 and ChatGPT-4o in the Japanese National Dental Examination
    Uehara, Osamu
    Morikawa, Tetsuro
    Harada, Fumiya
    Sugiyama, Nodoka
    Matsuki, Yuko
    Hiraki, Daichi
    Sakurai, Hinako
    Kado, Takashi
    Yoshida, Koki
    Murata, Yukie
    Matsuoka, Hirofumi
    Nagasawa, Toshiyuki
    Furuichi, Yasushi
    Abiko, Yoshihiro
    Miura, Hiroko
    JOURNAL OF DENTAL EDUCATION, 2024,
  • [29] Performance of ChatGPT and GPT-4 on Neurosurgery Written Board Examinations
    Ali, Rohaid
    Tang, Oliver Y.
    Connolly, Ian D.
    Sullivan, Patricia L. Zadnik
    Shin, John H.
    Fridley, Jared S.
    Asaad, Wael F.
    Cielo, Deus
    Oyelese, Adetokunbo A.
    Doberstein, Curtis E.
    Gokaslan, Ziya L.
    Telfeian, Albert E.
    NEUROSURGERY, 2023, 93 (06) : 1353 - 1365
  • [30] Performance of ChatGPT on a free-response anaesthesia primary examination
    Cai, Steven C.
    Tung, Alpha M. S.
    Eslick, Adam T.
    BRITISH JOURNAL OF ANAESTHESIA, 2024, 133 (01) : 219 - 221