Extracting psychiatric stressors for suicide from social media using deep learning

Cited by: 85
Authors
Du, Jingcheng [1]
Zhang, Yaoyun [1]
Luo, Jianhong [1, 2]
Jia, Yuxi [1, 3]
Wei, Qiang [1]
Tao, Cui [1]
Xu, Hua [1]
Affiliations
[1] Univ Texas Houston, Sch Biomed Informat, 7000 Fannin St Suite 600, Houston, TX 77030 USA
[2] Zhejiang Sci Tech Univ, Dept Management Sci & Engn, Hangzhou 310018, Zhejiang, Peoples R China
[3] Jilin Univ, Sch Publ Hlth, Dept Med Informat, Changchun 130021, Jilin, Peoples R China
Funding
National Institutes of Health (NIH), USA
Keywords
Suicide; Mental health; Psychiatric stressors; Social media; Deep learning; Named entity recognition;
DOI
10.1186/s12911-018-0632-8
Chinese Library Classification (CLC) number
R-058
Abstract
Background: Suicide is one of the leading causes of death in the United States, and psychiatric stressors are one of its major causes. Detecting psychiatric stressors in an at-risk population can facilitate the early prevention of suicidal behaviors and suicide. In recent years, the widespread popularity of social media and its real-time flow of shared information have made early intervention possible across a large-scale population. However, few automated approaches have been proposed to extract psychiatric stressors from Twitter. The goal of this study was to investigate techniques for recognizing suicide-related psychiatric stressors from Twitter using deep learning based methods and a transfer learning strategy that leverages an existing annotated dataset from clinical text.
Methods: First, a dataset of suicide-related tweets was collected from Twitter streaming data with a multi-step pipeline comprising keyword-based retrieval, filtering, and further refinement with an automated binary classifier. Specifically, a convolutional neural network (CNN) based algorithm was used to build the binary classifier. Next, psychiatric stressors were annotated in the suicide-related tweets. The stressor recognition problem was framed as a typical named entity recognition (NER) task and tackled with recurrent neural network (RNN) based methods. Moreover, to reduce the annotation cost and improve performance, a transfer learning strategy was adopted that leverages existing annotations from clinical text.
Results & conclusions: To the best of our knowledge, this is the first effort to extract psychiatric stressors from Twitter data using deep learning based approaches. Comparison with traditional machine learning algorithms shows the superiority of deep learning based approaches. The CNN performs best at identifying suicide-related tweets, with a precision of 78% and an F-1 measure of 83%, outperforming Support Vector Machine (SVM) and Extra Trees (ET), among others. RNN based psychiatric stressor recognition obtains the best F-1 measure of 53.25% by exact match and 67.94% by inexact match, outperforming Conditional Random Fields (CRF). Moreover, transfer learning from clinical notes to the Twitter corpus outperforms training on the Twitter corpus alone, with an F-1 measure of 54.9% by exact match. The results indicate the advantages of deep learning based methods for automated stressor recognition from social media.
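The abstract reports NER results under both exact and inexact match. The following is a minimal sketch of how the two scores can differ for the same predictions. It is not the authors' evaluation code: the abstract does not define inexact match, so this sketch assumes it means any token overlap between a predicted and a gold span, and the B/I/O tag scheme and the function names `bio_to_spans`, `overlaps`, and `span_f1` are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Collect entity spans from a B/I/O tag sequence (assumed scheme)."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        # A "B" or "O" tag closes any span that is currently open.
        if tag in ("B", "O") and start is not None:
            spans.append((start, i))
            start = None
        if tag == "B":
            start = i
        elif tag == "I" and start is None:
            start = i  # tolerate an "I" with no preceding "B"
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def overlaps(a: Span, b: Span) -> bool:
    """True if the two spans share at least one token."""
    return a[0] < b[1] and b[0] < a[1]


def span_f1(gold: List[Span], pred: List[Span], exact: bool = True) -> float:
    """F-1 over spans; exact=False counts any overlap as a match (assumption)."""
    if exact:
        matched_pred = sum(1 for p in pred if p in gold)
        matched_gold = sum(1 for g in gold if g in pred)
    else:
        matched_pred = sum(1 for p in pred if any(overlaps(p, g) for g in gold))
        matched_gold = sum(1 for g in gold if any(overlaps(g, p) for p in pred))
    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: the first predicted span covers only part of the gold span.
gold_spans = bio_to_spans(["O", "B", "I", "O", "B"])
pred_spans = bio_to_spans(["O", "B", "O", "O", "B"])
exact_f1 = span_f1(gold_spans, pred_spans, exact=True)
inexact_f1 = span_f1(gold_spans, pred_spans, exact=False)
```

In this toy case the partially recovered span counts as a miss under exact match and as a hit under inexact match, which is the kind of gap the two reported F-1 figures reflect.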
Pages: 11