Facebook language predicts depression in medical records

Cited by: 341
Authors
Eichstaedt, Johannes C. [1]
Smith, Robert J. [2]
Merchant, Raina M. [2, 3]
Ungar, Lyle H. [1, 2]
Crutchley, Patrick [1, 2]
Preotiuc-Pietro, Daniel [1]
Asch, David A. [2, 4]
Schwartz, H. Andrew [5]
Affiliations
[1] Univ Penn, Posit Psychol Ctr, Philadelphia, PA 19104 USA
[2] Univ Penn, Penn Med Ctr Digital Hlth, Philadelphia, PA 19104 USA
[3] Univ Penn, Perelman Sch Med, Dept Emergency Med, Philadelphia, PA 19104 USA
[4] Philadelphia Vet Affairs Med Ctr, Ctr Hlth Equ Res & Promot, Philadelphia, PA 19104 USA
[5] SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY 11794 USA
Keywords
big data; depression; social media; Facebook; screening; all-cause mortality; primary care; health; severity; symptoms
DOI
10.1073/pnas.1802331115
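Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification codes
07; 0710; 09
Abstract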
Depression, the most prevalent mental illness, is underdiagnosed and undertreated, highlighting the need to extend the scope of current screening methods. Here, we use language from Facebook posts of consenting individuals to predict depression recorded in electronic medical records. We accessed the history of Facebook statuses posted by 683 patients visiting a large urban academic emergency department, 114 of whom had a diagnosis of depression in their medical records. Using only the language preceding their first documentation of a diagnosis of depression, we could identify depressed patients with fair accuracy [area under the curve (AUC) = 0.69], approximately matching the accuracy of screening surveys benchmarked against medical records. Restricting Facebook data to only the 6 months immediately preceding the first documented diagnosis of depression yielded a higher prediction accuracy (AUC = 0.72) for those users who had sufficient Facebook data. Significant prediction of future depression status was possible as far as 3 months before its first documentation. We found that language predictors of depression include emotional (sadness), interpersonal (loneliness, hostility), and cognitive (preoccupation with the self, rumination) processes. Unobtrusive depression assessment through social media of consenting individuals may become feasible as a scalable complement to existing screening and monitoring procedures.
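The abstract reports accuracy as area under the ROC curve (AUC). As a minimal sketch of what that number means, and not the authors' pipeline, the Python below computes AUC through its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name `auc` and the toy labels and scores are assumptions invented for illustration; they are not data from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    labels: 1 for a positive case (e.g. diagnosis recorded), 0 otherwise.
    scores: model outputs, where higher means "more likely positive".
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, made up for this example only.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
print(auc(labels, scores))  # 0.75
```

On this scale 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the frame in which the reported values of 0.69 and 0.72 are read. The pairwise loop is quadratic in the number of cases; it is chosen here for readability, not efficiency.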
Pages: 11203-11208
Number of pages: 6