Bridging the Gap Between Urological Research and Patient Understanding: The Role of Large Language Models in Automated Generation of Layperson's Summaries

Cited by: 45
Authors
Eppler, Michael B. [1 ,2 ,3 ]
Ganjavi, Conner [1 ,2 ,3 ]
Knudsen, J. Everett [1 ,2 ,3 ]
Davis, Ryan J. [1 ,2 ,3 ]
Ayo-Ajibola, Oluwatobiloba [1 ,2 ,3 ]
Desai, Aditya [1 ,2 ,3 ]
Ramacciotti, Lorenzo Storino [2 ]
Chen, Andrew [1 ,2 ,3 ]
Abreu, Andre De Castro [1 ,2 ,3 ]
Desai, Mihir M. [1 ,2 ,3 ]
Gill, Inderbir S. [1 ,2 ,3 ]
Cacciamani, Giovanni E. [1 ,2 ,3 ]
Affiliations
[1] Univ Southern Calif, USC Inst Urol, Keck Sch Med, Los Angeles, CA 90007 USA
[2] Univ Southern Calif, Keck Sch Med, Catherine & Joseph Aresty Dept Urol, Los Angeles, CA 90007 USA
[3] Univ Southern Calif, USC Inst Urol, Artificial Intelligence Ctr, USC Urol, Los Angeles, CA USA
Keywords
urology; artificial intelligence; patient education as topic; comprehension; communication; READABILITY; LITERACY;
DOI
10.1097/UPJ.0000000000000428
Chinese Library Classification (CLC)
R5 [Internal Medicine]; R69 [Urology (Genitourinary Diseases)];
Discipline Classification Code
1002; 100201;
Abstract
Introduction: This study assessed ChatGPT's ability to generate readable, accurate, and clear layperson summaries of urological studies, and compared the performance of ChatGPT-generated summaries with original abstracts and author-written patient summaries to determine its effectiveness as a potential solution for creating accessible medical literature for the public.
Methods: Articles from the top 5 ranked urology journals were selected. A ChatGPT prompt was developed following guidelines to maximize readability, accuracy, and clarity while minimizing variability. Readability scores and grade-level indicators were calculated for the ChatGPT summaries, original abstracts, and patient summaries. Two MD physicians independently rated the accuracy and clarity of the ChatGPT-generated layperson summaries. Statistical analyses were conducted to compare readability scores. Cohen's kappa coefficient was used to assess interrater reliability for the correctness and clarity evaluations.
Results: A total of 256 journal articles were included. The ChatGPT-generated summaries were created in an average time of 17.5 (SD 15.0) seconds. The readability scores of the ChatGPT-generated summaries were significantly better than those of the original abstracts: Global Readability Score 54.8 (12.3) vs 29.8 (18.5), Flesch-Kincaid Reading Ease 54.8 (12.3) vs 29.8 (18.5), Flesch-Kincaid Grade Level 10.4 (2.2) vs 13.5 (4.0), Gunning Fog Score 12.9 (2.6) vs 16.6 (4.1), SMOG Index 9.1 (2.0) vs 12.0 (3.0), Coleman-Liau Index 12.9 (2.1) vs 14.9 (3.7), and Automated Readability Index 11.1 (2.5) vs 12.0 (5.7); P < .0001 for all except the Automated Readability Index, for which P = .037. The correctness rate of ChatGPT outputs was >85% across all categories assessed, with interrater agreement (Cohen's kappa) between the 2 independent physician reviewers ranging from 0.76 to 0.95.
Conclusions: ChatGPT can create accurate summaries of scientific abstracts for patients, and well-crafted prompts enhance user-friendliness. Although the summaries are satisfactory, expert verification is necessary for improved accuracy.
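As a rough illustration of the metrics named in the abstract, the Python sketch below recomputes the same readability and grade-level indices with the open-source textstat package and estimates interrater agreement with scikit-learn's cohen_kappa_score. This is assumed tooling for illustration only, not the authors' actual analysis pipeline; the sample text and rater labels are hypothetical.

    import textstat
    from sklearn.metrics import cohen_kappa_score

    # Hypothetical ChatGPT-generated layperson summary (placeholder text).
    summary = ("This study looked at whether a simpler summary of a research paper "
               "helps patients understand the results. We compared the new summaries "
               "with the original abstracts and found the new ones easier to read.")

    # Readability and grade-level indicators reported in the study.
    print("Flesch Reading Ease:        ", textstat.flesch_reading_ease(summary))
    print("Flesch-Kincaid Grade Level: ", textstat.flesch_kincaid_grade(summary))
    print("Gunning Fog Score:          ", textstat.gunning_fog(summary))
    print("SMOG Index:                 ", textstat.smog_index(summary))
    print("Coleman-Liau Index:         ", textstat.coleman_liau_index(summary))
    print("Automated Readability Index:", textstat.automated_readability_index(summary))

    # Interrater reliability: Cohen's kappa between two physician reviewers rating
    # each summary as correct (1) or incorrect (0); toy labels for illustration.
    rater_a = [1, 1, 0, 1, 1, 1, 0, 1]
    rater_b = [1, 1, 0, 1, 0, 1, 0, 1]
    print("Cohen's kappa:", cohen_kappa_score(rater_a, rater_b))

For reference, the Flesch Reading Ease and Flesch-Kincaid Grade Level follow the standard formulas 206.835 − 1.015 × (words/sentences) − 84.6 × (syllables/words) and 0.39 × (words/sentences) + 11.8 × (syllables/words) − 15.59, and Cohen's kappa is (p_o − p_e)/(1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance.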
Pages: 435+
Number of pages: 9