AI-Generated Information for Vascular Patients: Assessing the Standard of Procedure-Specific Information Provided by the ChatGPT AI-Language Model

Cited by: 11
Authors
Haidar, Omar [1 ]
Jaques, Alexander [1 ]
McCaughran, Pierre W. [1 ]
Metcalfe, Matthew J. [1 ]
Institution
[1] Lister Hosp, Vasc Surg, Stevenage, England
Keywords
chatgpt; patient education; ai; artificial intelligence; vascular; INTERNET; SURGERY; QUALITY;
DOI
10.7759/cureus.49764
Chinese Library Classification
R5 [Internal Medicine];
Subject Classification Codes
1002 ; 100201 ;
Abstract
Introduction: Ensuring access to high-quality information is paramount to facilitating informed surgical decision-making. The use of the internet to access health-related information is increasing, as is the prevalence of AI language models such as ChatGPT. We aim to assess the standard of AI-generated patient-facing information through a qualitative analysis of its readability and quality.

Materials and methods: We performed a retrospective qualitative analysis of information regarding three common vascular procedures: endovascular aortic repair (EVAR), endovenous laser ablation (EVLA), and femoro-popliteal bypass (FPBP). The ChatGPT responses were compared to patient information leaflets provided by the vascular charity Circulation Foundation UK. Readability was assessed using four readability scores: the Flesch-Kincaid reading ease (FKRE) score, the Flesch-Kincaid grade level (FKGL), the Gunning fog score (GFS), and the simple measure of gobbledygook (SMOG) index. Quality was assessed by two independent assessors using the DISCERN tool.

Results: The mean FKRE score for ChatGPT responses was 33.3, compared to 59.1 for the information provided by the Circulation Foundation (SD=14.5, p=0.025), indicating poor readability of AI-generated information. The FKGL indicated that the expected grade of students likely to read and understand ChatGPT responses was consistently higher than that for the information leaflets, at 12.7 vs. 9.4 (SD=1.9, p=0.002). Two metrics, the GFS and the SMOG index, measure readability as the number of years of education required to understand a piece of writing; both indicated that AI-generated answers were less accessible. The GFS for ChatGPT-provided information was 16.7 years versus 12.8 years for the leaflets (SD=2.2, p=0.002), and the SMOG index scores were 12.2 and 9.4 years for ChatGPT and the patient information leaflets, respectively (SD=1.7, p=0.001). The DISCERN scores were consistently higher for human-generated patient information leaflets than for AI-generated information across all procedures; the mean score for the information provided by ChatGPT was 50.3 vs. 56.0 for the Circulation Foundation leaflets (SD=3.38, p<0.001).

Conclusion: We concluded that AI-generated information about vascular surgical procedures is currently poor in both the readability of text and the quality of information. Patients should be directed to reputable, human-generated information sources from trusted professional bodies to supplement direct education from the clinician during the pre-procedure consultation process.
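The four readability metrics above are closed-form functions of word, sentence, and syllable counts. A minimal Python sketch using the published Flesch-Kincaid, Gunning fog, and SMOG formulas; the vowel-group syllable heuristic and function names are illustrative assumptions, not the tooling the study used (the authors cite the WebFX readability test):

```python
import math
import re

def count_syllables(word: str) -> int:
    """Crude heuristic: count vowel groups, dropping a trailing silent 'e'."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    n = len(groups)
    if word.lower().endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def readability(text: str) -> dict:
    """Return FKRE, FKGL, GFS, and SMOG scores for a passage of prose."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    # "Complex" words (>= 3 syllables) drive the Gunning fog and SMOG scores.
    complex_words = sum(1 for w in words if count_syllables(w) >= 3)
    W, S = len(words), len(sentences)
    return {
        "FKRE": 206.835 - 1.015 * (W / S) - 84.6 * (syllables / W),
        "FKGL": 0.39 * (W / S) + 11.8 * (syllables / W) - 15.59,
        "GFS": 0.4 * ((W / S) + 100 * (complex_words / W)),
        "SMOG": 1.0430 * math.sqrt(complex_words * (30 / S)) + 3.1291,
    }
```

Higher FKRE means easier text, while FKGL, GFS, and SMOG all rise with difficulty — which is why ChatGPT's lower FKRE and higher grade-level scores both point to poorer readability.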
Pages: 10