Assessing the readability, reliability, and quality of artificial intelligence chatbot responses to the 100 most searched queries about cardiopulmonary resuscitation: An observational study

Cited by: 2
Authors
Arca, Dilek Omur [1 ,3 ]
Erdemir, Ismail [1 ]
Kara, Fevzi [1 ]
Shermatov, Nurgazy [1 ]
Odacioglu, Muruvvet [1 ]
Ibisoglu, Emel [1 ]
Hanci, Ferid Baran [2 ]
Sagiroglu, Gonul [1 ]
Hanci, Volkan [1 ]
Affiliations
[1] Dokuz Eylul Univ, Sch Med, Dept Anesthesiol & Reanimat, Izmir, Turkiye
[2] Ostim Tech Univ, Dept Fac Engn, Artificial Intelligence Engn, Ankara, Turkiye
[3] Dokuz Eylul Univ, Sch Med, Dept Anesthesiol & Reanimat, 1606 15 Temmuz Yerleskesi, TR-35340 Balcova, Izmir, Turkiye
Keywords
Artificial intelligence; Bard; chatbot; ChatGPT; Gemini; Perplexity; quality; readability; reliability; CARDIAC-ARREST; PATIENT EDUCATION; GUIDELINES;
DOI
10.1097/MD.0000000000038352
Chinese Library Classification
R5 [Internal Medicine]
Discipline Codes
1002; 100201
Abstract
This study aimed to evaluate the readability, reliability, and quality of responses by 4 selected artificial intelligence (AI)-based large language model (LLM) chatbots to questions related to cardiopulmonary resuscitation (CPR). This was a cross-sectional study. Responses to the 100 most frequently asked questions about CPR from 4 selected chatbots (ChatGPT-3.5 [OpenAI], Google Bard [Google AI], Google Gemini [Google AI], and Perplexity [Perplexity AI]) were analyzed for readability, reliability, and quality. The chatbots were first asked, in English: "What are the 100 most frequently asked questions about cardiopulmonary resuscitation?" Each of the 100 queries derived from the responses was then posed individually to the 4 chatbots. The resulting 400 responses, treated as patient education materials (PEMs), were assessed for quality and reliability using the modified DISCERN Questionnaire, the Journal of the American Medical Association benchmark criteria, and the Global Quality Score. Readability was assessed with 2 independent calculators, each computing scores with metrics including the Flesch Reading Ease Score, Flesch-Kincaid Grade Level, Simple Measure of Gobbledygook, Gunning Fog Index, and Automated Readability Index. A total of 100 responses from each of the 4 chatbots were analyzed. When the median readability values from Calculators 1 and 2 were compared with the 6th-grade reading level, the difference between the groups was highly significant (P < .001). By every formula, the readability level of the responses was above the 6th grade. From easiest to most difficult to read, the chatbots ranked Bard, Perplexity, Gemini, and ChatGPT-3.5. The text content provided by all 4 chatbots was thus written above the 6th-grade level. We believe that enhancing the quality, reliability, and readability of PEMs will make them easier for readers to understand and support more accurate performance of CPR; patients who receive bystander CPR may therefore have an increased likelihood of survival.
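The grade-level findings above rest on standard readability formulas. As an illustrative sketch only (not the calculators used in the study), the Flesch Reading Ease Score and Flesch-Kincaid Grade Level can be computed from word, sentence, and syllable counts; the counts below are hypothetical, not study data:

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease Score: higher scores mean easier text.
    Roughly 80-90 corresponds to a 6th-grade ("easy") reading level."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade
    required to understand the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# Hypothetical counts for a 100-word chatbot response:
words, sentences, syllables = 100, 5, 150
print(flesch_reading_ease(words, sentences, syllables))
print(flesch_kincaid_grade(words, sentences, syllables))
```

A Flesch-Kincaid result near 10, as in this hypothetical example, would sit well above the 6th-grade target the study applies to patient education materials.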
Pages: 11