Evaluating Diagnostic Accuracy and Treatment Efficacy in Mental Health: A Comparative Analysis of Large Language Model Tools and Mental Health Professionals

Cited by: 2
Authors
Levkovich, Inbar [1]
Affiliations
[1] Tel Hai Acad Coll, Fac Educ, Upper Galilee 2208, Israel
Keywords
large language models; artificial intelligence; mental health; depression; suicide; schizophrenia; social phobia; PTSD; BELIEFS
DOI
10.3390/ejihpe15010009
Chinese Library Classification number
B849 [Applied Psychology];
Discipline classification code
040203;
Abstract
Large language models (LLMs) offer promising possibilities in mental health, yet their ability to assess disorders and recommend treatments remains underexplored. This quantitative cross-sectional study evaluated four LLMs (Gemini (Gemini 2.0 Flash Experimental), Claude (Claude 3.5 Sonnet), ChatGPT-3.5, and ChatGPT-4) using text vignettes representing conditions such as depression, suicidal ideation, early and chronic schizophrenia, social phobia, and PTSD. Each model's diagnostic accuracy, treatment recommendations, and predicted outcomes were compared with norms established by mental health professionals. Findings indicated that for certain conditions, including depression and PTSD, models like ChatGPT-4 achieved higher diagnostic accuracy compared to human professionals. However, in more complex cases, such as early schizophrenia, LLM performance varied, with ChatGPT-4 achieving only 55% accuracy, while other LLMs and professionals performed better. LLMs tended to suggest a broader range of proactive treatments, whereas professionals recommended more targeted psychiatric consultations and specific medications. In terms of outcome predictions, professionals were generally more optimistic regarding full recovery, especially with treatment, while LLMs predicted lower full recovery rates and higher partial recovery rates, particularly in untreated cases. While LLMs recommend a broader treatment range, their conservative recovery predictions, particularly for complex conditions, highlight the need for professional oversight. LLMs provide valuable support in diagnostics and treatment planning but cannot replace professional discretion.
Pages: 19
Related papers
41 in total
  • [1] Aich A, 2024, Arxiv, DOI arXiv:2406.12687
  • [2] Exploring the Role of Artificial Intelligence in Mental Healthcare: Current Trends and Future Directions - A Narrative Review for a Comprehensive Insight
    Alhuwaydi, Ahmed M.
    [J]. RISK MANAGEMENT AND HEALTHCARE POLICY, 2024, 17 : 1339 - 1348
  • [3] Revolutionizing healthcare: the role of artificial intelligence in clinical practice
    Alowais, Shuroug A.
    Alghamdi, Sahar S.
    Alsuhebany, Nada
    Alqahtani, Tariq
    Alshaya, Abdulrahman I.
    Almohareb, Sumaya N.
    Aldairem, Atheer
    Alrashed, Mohammed
    Bin Saleh, Khalid
    Badreldin, Hisham A.
    Al Yami, Majed S.
    Al Harbi, Shmeylan
    Albekairy, Abdulkareem M.
    [J]. BMC MEDICAL EDUCATION, 2023, 23 (01)
  • [4] Diagnostic error in mental health: a review
    Bradford, Andrea
    Meyer, Ashley
    Khan, Sundas
    Giardina, Traber D.
    Singh, Hardeep
    [J]. BMJ QUALITY & SAFETY, 2024, 33 (10) : 663 - 672
  • [5] Ethical Dilemmas, Mental Health, Artificial Intelligence, and LLM-Based Chatbots
    Cabrera, Johana
    Loyola, M. Soledad
    Magana, Irene
    Rojas, Rodrigo
    [J]. BIOINFORMATICS AND BIOMEDICAL ENGINEERING, IWBBIO 2023, PT II, 2023, 13920 : 313 - 326
  • [6] Using large language models in psychology
    Demszky, Dorottya
    Yang, Diyi
    Yeager, David
    Bryan, Christopher
    Clapper, Margarett
    Chandhok, Susannah
    Eichstaedt, Johannes
    Hecht, Cameron
    Jamieson, Jeremy
    Johnson, Meghann
    Jones, Michaela
    Krettek-Cobb, Danielle
    Lai, Leslie
    Jonesmitchell, Nirel
    Ong, Desmond
    Dweck, Carol
    Gross, James
    Pennebaker, James
    [J]. NATURE REVIEWS PSYCHOLOGY, 2023, 2 (11) : 688 - 701
  • [7] ChatGPT is not ready yet for use in providing mental health assessment and interventions
    Dergaa, Ismail
    Fekih-Romdhane, Feten
    Hallit, Souheil
    Loch, Alexandre Andrade
    Glenn, Jordan M.
    Fessi, Mohamed Saifeddin
    Ben Aissa, Mohamed
    Souissi, Nizar
    Guelmami, Noomen
    Swed, Sarya
    El Omri, Abdelfatteh
    Bragazzi, Nicola Luigi
    Ben Saad, Helmi
    [J]. FRONTIERS IN PSYCHIATRY, 2024, 14
  • [8] Comparing the Perspectives of Generative AI, Mental Health Experts, and the General Public on Schizophrenia Recovery: Case Vignette Study
    Elyoseph, Zohar
    Levkovich, Inbar
    [J]. JMIR MENTAL HEALTH, 2024, 11
  • [9] Assessing prognosis in depression: comparing perspectives of AI models, mental health professionals and the general public
    Elyoseph, Zohar
    Levkovich, Inbar
    Shinan-Altman, Shiri
    [J]. FAMILY MEDICINE AND COMMUNITY HEALTH, 2024, 12 (SUPPL_1)
  • [10] Beyond human expertise: the promise and limitations of ChatGPT in suicide risk assessment
    Elyoseph, Zohar
    Levkovich, Inbar
    [J]. FRONTIERS IN PSYCHIATRY, 2023, 14