Assessing the accuracy and efficiency of Chat GPT-4 Omni (GPT-4o) in biomedical statistics: Comparative study with traditional tools

Cited by: 2

Authors:

Meo, Anusha S. [1]

Shaikh, Narmeen [2]

Meo, Sultan A. [3]

Affiliations:

[1] Univ Aberdeen, Sch Med, Med Sci & Nutr, Aberdeen, Scotland

[2] King Saud Univ, Coll Med, Riyadh, Saudi Arabia

[3] King Saud Univ, Coll Med, Dept Physiol, Riyadh, Saudi Arabia

Source:

SAUDI MEDICAL JOURNAL | 2024 / Vol. 45 / Issue 12

Keywords:

Artificial Intelligence; GPT-4; Omni; medical statistics; statistical analysis

DOI:

10.15537/smj.2024.45.12.20240454

Chinese Library Classification (CLC) number:

R5 [Internal Medicine]

Discipline classification code:

1002; 100201

Abstract:

Objectives: To assess the accuracy of ChatGPT-4 Omni (GPT-4o) in biomedical statistics. The recently released artificial intelligence model GPT-4o has the potential to analyze sophisticated and extensive datasets, challenging the expertise of statisticians who use traditional statistical tools for data analysis. Methods: This study was performed in the Department of Physiology, College of Medicine, King Saud University, Riyadh, Saudi Arabia, in May 2024. Three datasets in raw Excel file format were imported into Statistical Package for the Social Sciences (SPSS) version 29 for data analysis. Based on this analysis, a script of 9 questions was prepared to prompt GPT-4 Omni, which was then used to analyze all 3 datasets. The score and the time were recorded for each result, and each result was verified against the original analysis performed in SPSS. Results: GPT-4 Omni scored 73 (85.88%) out of 85 points across all 3 datasets. The datasets took a total of 38.43 minutes to analyze. Individually, Omni scored 21/25 (84%) for the small dataset in 487.4 seconds, 20/25 (80%) for the middle dataset in 747.02 seconds, and 32/35 (91.42%) for the large dataset in 1071 seconds. GPT-4 Omni produced accurate graphs and charts. Conclusion: ChatGPT-4 Omni scored 80% or higher on all 3 statistical datasets in a short period. GPT-4 Omni also produced accurate graphs and charts as commanded; however, it required explicit commands with clear instructions to avoid errors and omissions and to achieve appropriate results in biomedical data analysis.
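
The aggregate figures in the Results can be re-derived from the per-dataset scores and times. The following is a minimal Python sketch of that arithmetic, not the authors' procedure; the dictionary keys and variable names are illustrative assumptions, and only the numbers come from the abstract. The last decimal place may differ slightly from the reported values depending on whether figures were rounded or truncated.

```python
# Per-dataset figures as reported in the abstract:
# (points scored, points available, analysis time in seconds).
# Keys and variable names are illustrative, not taken from the paper.
results = {
    "small": (21, 25, 487.4),
    "middle": (20, 25, 747.02),
    "large": (32, 35, 1071.0),
}

# Aggregate the scores and the time across the three datasets.
total_scored = sum(scored for scored, _, _ in results.values())
total_available = sum(available for _, available, _ in results.values())
total_seconds = sum(seconds for _, _, seconds in results.values())

overall_pct = 100 * total_scored / total_available
total_minutes = total_seconds / 60

# Print per-dataset percentages, then the overall score and total time.
for name, (scored, available, seconds) in results.items():
    pct = 100 * scored / available
    print(f"{name}: {scored}/{available} = {pct:.2f}% in {seconds} s")
print(f"overall: {total_scored}/{total_available} = {overall_pct:.2f}%")
print(f"total time: {total_minutes:.2f} min")
```

Keeping the per-dataset tuples in one mapping makes it easy to check both the overall score (73 of 85 points) and the total analysis time against the reported figures.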


Pages: 1383-1390

Page count: 8
