Machine learning for passive mental health symptom prediction: Generalization across different longitudinal mobile sensing studies

Cited by: 30
Authors
Adler, Daniel A. [1]
Wang, Fei [2]
Mohr, David C. [3]
Choudhury, Tanzeem [1]
Affiliations
[1] Cornell Tech, Dept Informat Sci, New York, NY 10044 USA
[2] Weill Cornell Med, Dept Populat Hlth Sci, New York, NY USA
[3] Northwestern Univ, Feinberg Sch Med, Dept Prevent Med, Ctr Behav Intervent Technol, Chicago, IL 60611 USA
Funding
US National Science Foundation
Keywords
Depression
DOI
10.1371/journal.pone.0266516
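Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Subject classification codes
07; 0710; 09
Abstract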
Mobile sensing data processed using machine learning models can passively and remotely assess mental health symptoms from the context of patients' lives. Prior work has trained models using data from single longitudinal studies, collected from demographically homogeneous populations, over short time periods, using a single data collection platform or mobile application. The generalizability of model performance across studies has not been assessed. This study presents a first analysis of whether models trained using combined longitudinal study data to predict mental health symptoms generalize across current publicly available data. We combined data from the CrossCheck (individuals living with schizophrenia) and StudentLife (university students) studies. In addition to assessing generalizability, we explored whether personalizing models to align mobile sensing data, and oversampling less-represented severe symptoms, improved model performance. Leave-one-subject-out cross-validation (LOSO-CV) results were reported. Two symptoms (sleep quality and stress) had similar question-response structures across studies and were used as outcomes to explore cross-dataset prediction. Models trained with combined data were more likely to be predictive (significant improvement over predicting the training data mean) than models trained with single-study data. Expected model performance improved if the distance between training and validation feature distributions decreased using combined versus single-study data. Personalization aligned each LOSO-CV participant with training data, but improved prediction only for CrossCheck stress. Oversampling significantly improved severe symptom classification sensitivity and positive predictive value, but decreased model specificity. Taken together, these results show that machine learning models trained on combined longitudinal study data may generalize across heterogeneous datasets.
We encourage researchers to disseminate collected de-identified mobile sensing and mental health symptom data, and further standardize data types collected across studies to enable better assessment of model generalizability.
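The evaluation scheme named in the abstract, leave-one-subject-out cross-validation compared against predicting the training data mean, can be illustrated with a minimal sketch. This is not the authors' implementation: the ordinary least squares model, the mean absolute error metric, and all function names (`loso_splits`, `fit_linear`, `predict_linear`, `loso_mae`) are assumptions chosen for illustration, and the significance testing mentioned in the abstract is omitted.

```python
import numpy as np


def loso_splits(subject_ids):
    """Yield (held_out_subject, train_idx, test_idx), one split per subject."""
    subject_ids = np.asarray(subject_ids)
    for subject in np.unique(subject_ids):
        test_mask = subject_ids == subject
        yield subject, np.flatnonzero(~test_mask), np.flatnonzero(test_mask)


def fit_linear(X, y):
    """Ordinary least squares with an intercept (a stand-in model, assumed)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Apply the fitted intercept and slopes to new feature rows."""
    return np.column_stack([np.ones(len(X)), X]) @ coef


def loso_mae(X, y, subject_ids):
    """Per held-out subject: (model MAE, training-mean baseline MAE)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    results = {}
    for subject, train_idx, test_idx in loso_splits(subject_ids):
        # The model is fit only on data from the other subjects.
        coef = fit_linear(X[train_idx], y[train_idx])
        preds = predict_linear(coef, X[test_idx])
        model_err = np.mean(np.abs(preds - y[test_idx]))
        # Baseline: predict the mean of the training outcomes for every row.
        base_err = np.mean(np.abs(y[train_idx].mean() - y[test_idx]))
        results[subject] = (model_err, base_err)
    return results
```

Under these assumptions, a model counts as informative for a held-out participant when its error is lower than the baseline error; the study additionally required that improvement to be statistically significant.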
Pages: 20