Schizophrenia Detection Using Machine Learning Approach from Social Media Content

Cited by: 33
Authors
Bae, Yi Ji [1]
Shim, Midan [1,2]
Lee, Won Hee [1]
Affiliations
[1] Kyung Hee Univ, Dept Software Convergence, Yongin 17104, South Korea
[2] Kyung Hee Univ, Dept Biol, Seoul 02447, South Korea
Funding
National Research Foundation, Singapore
Keywords
social media; Reddit; schizophrenia; natural language processing; machine learning; topic modeling; linguistic inquiry and word count; treatment outcomes; prediction
DOI
10.3390/s21175924
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Subject classification codes
070302; 081704
Abstract
Schizophrenia is a severe mental disorder that ranks among the leading causes of disability worldwide. However, many cases of schizophrenia remain untreated due to failure to diagnose, self-denial, and social stigma. With the advent of social media, individuals suffering from schizophrenia share their mental health problems and seek support and treatment options. Machine learning approaches are increasingly used for detecting schizophrenia from social media posts. This study aims to determine whether machine learning could be effectively used to detect signs of schizophrenia in social media users by analyzing their social media texts. To this end, we collected posts from the social media platform Reddit focusing on schizophrenia, along with posts unrelated to mental health (fitness, jokes, meditation, parenting, relationships, and teaching) for the control group. We extracted linguistic features and content topics from the posts. Using supervised machine learning, we classified posts as schizophrenia-related or control and interpreted the important features to identify linguistic markers of schizophrenia. We applied unsupervised clustering to the features to uncover a coherent semantic representation of words in schizophrenia. We identified significant differences in linguistic features and topics, including increased use of third-person plural pronouns, negative emotion words, and symptom-related topics. We distinguished schizophrenia posts from control posts with an accuracy of 96%. Finally, we found that coherent semantic groups of words were the key to detecting schizophrenia. Our findings suggest that machine learning approaches could help us understand the linguistic characteristics of schizophrenia and identify individuals with schizophrenia, or otherwise at-risk individuals, using social media texts.
Pages: 18