Performance of ChatGPT on Nursing Licensure Examinations in the United States and China: Cross-Sectional Study

Cited by: 1
Authors
Wu, Zelin [1, 2]
Gan, Wenyi [3]
Xue, Zhaowen [1, 2]
Ni, Zhengxin [4]
Zheng, Xiaofei [1, 2]
Zhang, Yiyi [1, 2]
Affiliations
[1] First Affiliated Hosp, Dept Bone & Joint Surg, 613 Huangpu Ave West, Guangzhou 510630, Peoples R China
[2] First Affiliated Hosp, Sports Med Ctr, 613 Huangpu Ave West, Guangzhou 510630, Peoples R China
[3] Zhuhai Peoples Hosp, Dept Joint Surg & Sports Med, Zhuhai, Peoples R China
[4] Yangzhou Univ, Sch Nursing, Yangzhou, Peoples R China
Source
JMIR MEDICAL EDUCATION | 2024 / Vol. 10
Keywords
artificial intelligence; ChatGPT; nursing licensure examination; nursing; LLMs; large language models; nursing education; AI; nursing student; large language model; licensing; observation; observational study; China; USA; United States of America; auxiliary tool; accuracy rate; theoretical; EDUCATION
DOI
10.2196/52746
Chinese Library Classification (CLC) number
G40 [Education]
Discipline classification codes
040101; 120403
Abstract
Background: The creation of large language models (LLMs) such as ChatGPT is an important step in the development of artificial intelligence, and these models show great potential in medical education because of their powerful language understanding and generative capabilities. The purpose of this study was to quantitatively evaluate and comprehensively analyze ChatGPT's performance on questions from the nursing licensure examinations of the United States and China: the National Council Licensure Examination for Registered Nurses (NCLEX-RN) and the National Nursing Licensure Examination (NNLE), respectively.
Objective: This study aims to examine how well LLMs respond to NCLEX-RN and NNLE multiple-choice questions (MCQs) in different input languages, to evaluate whether LLMs can be used as multilingual learning aids for nursing, and to assess whether they possess a repository of professional knowledge applicable to clinical nursing practice.
Methods: First, we compiled 150 NCLEX-RN Practical MCQs, 240 NNLE Theoretical MCQs, and 240 NNLE Practical MCQs. Then, the translation function of ChatGPT 3.5 was used to translate the NCLEX-RN questions from English to Chinese and the NNLE questions from Chinese to English. Finally, the original and translated versions of the MCQs were entered into ChatGPT 4.0, ChatGPT 3.5, and Google Bard. The LLMs were compared by accuracy rate, and accuracy was also compared between input languages.
Results: The accuracy rates of ChatGPT 4.0 for NCLEX-RN practical questions and Chinese-translated NCLEX-RN practical questions were 88.7% (133/150) and 79.3% (119/150), respectively. Although the difference was statistically significant (P=.03), the accuracy rate was generally satisfactory. ChatGPT 4.0 correctly answered 71.9% (169/235) of NNLE Theoretical MCQs and 69.1% (161/233) of NNLE Practical MCQs. When these questions were translated into English, its accuracy was 71.5% (168/235; P=.92) for NNLE Theoretical MCQs and 67.8% (158/233; P=.77) for NNLE Practical MCQs, with no statistically significant difference between input languages. ChatGPT 3.5 (NCLEX-RN P=.003, NNLE Theoretical P<.001, NNLE Practical P=.12) and Google Bard (NCLEX-RN P<.001, NNLE Theoretical P<.001, NNLE Practical P<.001) had lower accuracy rates for nursing-related MCQs than ChatGPT 4.0 in English input. For ChatGPT 3.5, accuracy was higher with English input than with Chinese input, and the difference was statistically significant (NCLEX-RN P=.02, NNLE Practical P=.02). Whether the NCLEX-RN and NNLE MCQs were submitted in Chinese or English, ChatGPT 4.0 had the highest number of unique correct responses and the lowest number of unique incorrect responses among the 3 LLMs.
Conclusions: This study of 618 nursing MCQs from the NCLEX-RN and NNLE found that ChatGPT 4.0 outperformed ChatGPT 3.5 and Google Bard in accuracy. It excelled in processing both English and Chinese inputs, underscoring its potential as a valuable tool in nursing education and clinical decision-making.
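The abstract reports P values for differences in accuracy rates but does not name the statistical test. As a minimal sketch, assuming a Pearson chi-square test on a 2x2 table without continuity correction (an assumption, not a method confirmed by the record), the comparison of 133/150 versus 119/150 can be checked as follows. The function name `chi_square_2x2` is hypothetical and introduced only for illustration.

```python
import math


def chi_square_2x2(correct_a, total_a, correct_b, total_b):
    """Pearson chi-square test (no continuity correction) for two accuracy rates.

    Assumption: this is one plausible test; the abstract does not say which
    test the authors used. Requires at least one correct and one incorrect
    answer overall, otherwise the denominator is zero.
    """
    wrong_a = total_a - correct_a
    wrong_b = total_b - correct_b
    n = total_a + total_b
    # Shortcut formula for a 2x2 table: n * (ad - bc)^2 / (row and column totals).
    numerator = n * (correct_a * wrong_b - wrong_a * correct_b) ** 2
    denominator = total_a * total_b * (correct_a + correct_b) * (wrong_a + wrong_b)
    statistic = numerator / denominator
    # With 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# ChatGPT 4.0 on NCLEX-RN: English original vs Chinese translation.
stat, p = chi_square_2x2(133, 150, 119, 150)
print(f"chi2 = {stat:.2f}, P = {p:.3f}")
```

Under this assumption the P value for the NCLEX-RN comparison comes out at roughly .03, in line with the reported figure. A continuity-corrected or exact test would give a somewhat different value, so agreement here does not establish which test was actually applied.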
Pages: 12