Evaluating the quality of medical content on YouTube using large language models

Citations: 0
Authors
Khalil, Mahmoud [1]
Mohamed, Fatma [2]
Shoufan, Abdulhadi [2, 3]
Affiliations
[1] Western Univ, Comp Sci Dept, London, ON, Canada
[2] Khalifa Univ, Ctr Cyber Phys Syst C2PS, Comp Sci Dept, Abu Dhabi, U Arab Emirates
[3] Khalifa Univ, Comp & Informat Engn Dept, Abu Dhabi, U Arab Emirates
Source
SCIENTIFIC REPORTS | 2025 / Volume 15 / Issue 01
Keywords
LLMs; Medical content; Content quality; YouTube; INFORMATION
DOI
10.1038/s41598-025-94208-6
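Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract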
YouTube has become a dominant source of medical information and health-related decision-making. Yet, many videos on this platform contain inaccurate or biased information. Although expert reviews could help mitigate this situation, the vast number of daily uploads makes this solution impractical. In this study, we explored the potential of Large Language Models (LLMs) to assess the quality of medical content on YouTube. We collected a set of videos previously evaluated by experts and prompted twenty models to rate their quality using the DISCERN instrument. We then analyzed the inter-rater agreement between the language models' and experts' ratings using Brennan-Prediger's (BP) Kappa. We found that LLMs exhibited a wide range of inter-rater agreements with the experts (ranging from -1.10 to 0.82). All models tended to give higher scores than the human experts. The agreement on individual questions tended to be lower, with some questions showing significant disagreement between models and experts. Including scoring guidelines in the prompt has improved model performance. We conclude that some LLMs are capable of evaluating the quality of medical videos. If used as stand-alone expert systems or embedded into traditional recommender systems, these models can mitigate the quality issue of health-related online videos.
Pages: 12
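
The agreement analysis named in the abstract can be illustrated with a minimal sketch of the unweighted Brennan-Prediger coefficient, kappa = (p_o - 1/q) / (1 - 1/q), where p_o is the observed proportion of identical ratings and q is the number of rating categories. The function name, the sample ratings, and the choice of q = 5 (the five-point DISCERN scale) are assumptions for illustration. The abstract does not say whether the study used a weighted variant, so this should not be read as the authors' exact computation.

```python
from typing import Sequence


def brennan_prediger_kappa(
    rater_a: Sequence[int],
    rater_b: Sequence[int],
    num_categories: int,
) -> float:
    """Unweighted Brennan-Prediger kappa for two raters.

    Chance agreement is fixed at 1 / num_categories, independent of how
    often each rater actually used each category.
    """
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must score the same number of items.")
    if len(rater_a) == 0:
        raise ValueError("At least one rated item is required.")
    if num_categories < 2:
        raise ValueError("At least two rating categories are required.")

    # Observed agreement: share of items with identical scores.
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    observed = matches / len(rater_a)

    # Chance agreement under a uniform distribution over categories.
    chance = 1.0 / num_categories

    return (observed - chance) / (1.0 - chance)


if __name__ == "__main__":
    # Hypothetical scores on a 1-5 scale, not data from the study.
    expert_scores = [1, 2, 3, 4, 5]
    model_scores = [1, 2, 3, 4, 4]
    print(brennan_prediger_kappa(expert_scores, model_scores, num_categories=5))
```

Because the chance term depends only on the number of categories, the coefficient stays well defined even when one rater concentrates its scores in a narrow band, which is relevant to the finding that the models tended to score higher than the experts.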