Investigating the interpretability of ChatGPT in mental health counseling: An analysis of artificial intelligence generated content differentiation

Times Cited: 0
Authors
Liu, Yang [1 ]
Wang, Fan [1 ]
Affiliations
[1] Wuhan Univ, Sch Informat Management, Wuhan 430072, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Interpretability analysis; Mental health; Machine learning; AIGC; Psychological counseling; Large language models; Topic modeling; DISORDERS; SEEKING;
DOI
10.1016/j.cmpb.2025.108864
Chinese Library Classification
TP39 [Computer Applications];
Discipline Code
081203; 0835;
Abstract
The global impact of COVID-19 has caused a significant rise in the demand for psychological counseling services, creating pressure on existing mental health professionals. Large language models (LLMs), such as ChatGPT, are considered a novel solution for delivering online psychological counseling. However, concerns about performance evaluation, emotional expression, high levels of anthropomorphism, ethics, transparency, and privacy breaches must be addressed before LLMs can be widely adopted. This study aimed to evaluate ChatGPT's effectiveness and emotional support capabilities in providing mental health counseling services from both macro and micro perspectives, examining whether it possesses psychological support abilities comparable to those of human experts. Building on the macro-level evaluation, we conducted a deeper comparison of the linguistic differences between ChatGPT and human experts at the micro level. In addition, in response to current policy requirements on the labeling of AI-generated content, we further explored how to identify artificial intelligence-generated content (AIGC) in counseling texts and which micro-level linguistic features can effectively distinguish AIGC from user-generated content (UGC). Finally, the study addressed transparency, privacy breaches, and ethical concerns. We utilized ChatGPT for psychological interventions, applying LLMs to address various mental health issues. The BERTopic algorithm was used to evaluate the content across multiple mental health problems. Deep learning techniques were employed to differentiate between AIGC and UGC in psychological counseling responses. Furthermore, Local Interpretable Model-agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP) were applied to evaluate interpretability, providing deeper insight into the models' decision-making and enhancing transparency. At the macro level, ChatGPT demonstrated performance comparable to human experts, exhibiting professionalism, diversity, empathy, and a high degree of human likeness, making it highly effective in counseling services. At the micro level, deep learning models achieved accuracy rates of 99.12% and 96.13% in distinguishing content generated by ChatGPT 3.5 and ChatGPT 4.0 from UGC, respectively. Interpretability analysis revealed that context, sentence structure, and emotional expression were key factors differentiating AIGC from UGC. The findings highlight ChatGPT's potential to deliver effective online psychological counseling and demonstrate a reliable framework for distinguishing between artificial intelligence-generated and human-generated content. This study underscores the importance of leveraging large language models to support mental health services while addressing concerns about high levels of anthropomorphism as well as ethical and practical challenges.
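To illustrate the micro-level step described above (classifying counseling responses as AIGC or UGC and explaining the prediction with LIME), the following is a minimal, hypothetical sketch. It is not the authors' pipeline: the study used deep learning classifiers on real counseling data, whereas this example trains a simple TF-IDF plus logistic regression baseline on a tiny invented corpus purely to show how word-level LIME attributions are obtained.

```python
# Hypothetical sketch: a baseline AIGC-vs-UGC text classifier with a
# LIME explanation of one prediction. Requires scikit-learn and lime.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from lime.lime_text import LimeTextExplainer

# Invented counseling responses for illustration only: label 1 = AIGC, 0 = UGC.
texts = [
    "It is completely understandable to feel this way; consider reaching out to a professional.",
    "I hear you, that sounds exhausting. What usually helps you unwind after work?",
    "Here are several evidence-based strategies you might try to manage your anxiety.",
    "Honestly, I went through something similar last year and therapy really helped me.",
    "Remember that seeking support is a sign of strength, not weakness.",
    "Have you talked to your sister about how her comments make you feel?",
]
labels = [1, 0, 1, 0, 1, 0]

# Fit a simple stand-in classifier (the paper uses deep learning models instead).
pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression())
pipeline.fit(texts, labels)

# Explain a single prediction: which words push the model toward "AIGC" vs "UGC".
explainer = LimeTextExplainer(class_names=["UGC", "AIGC"])
explanation = explainer.explain_instance(
    texts[0], pipeline.predict_proba, num_features=6
)
print(explanation.as_list())  # (word, weight) pairs for the explained instance
```

In the same spirit, SHAP values or a BERTopic model could be substituted to reproduce the paper's other analyses; this sketch only demonstrates the general explain-a-text-classifier workflow, not the reported results.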
Pages: 32