Evaluation of nutritional status and clinical depression classification using an explainable machine learning method

Cited by: 7
Authors
Hosseinzadeh Kasani, Payam [1, 2]
Lee, Jung Eun [2]
Park, Chihyun [2, 3]
Yun, Cheol-Heui [4, 5]
Jang, Jae-Won [1, 6]
Lee, Sang-Ah [2, 7]
Affiliations
[1] Kangwon Natl Univ Hosp, Dept Neurol, Chunchon, South Korea
[2] Kangwon Natl Univ, Interdisciplinary Grad Program Med Bigdata Conver, Chunchon, South Korea
[3] Kangwon Natl Univ, Dept Comp Sci & Engn, Chunchon, South Korea
[4] Seoul Natl Univ, Dept Agr Biotechnol, Seoul, South Korea
[5] Seoul Natl Univ, Res Inst Agr & Life Sci, Seoul, South Korea
[6] Kangwon Natl Univ, Dept Neurol, Sch Med, Chunchon, South Korea
[7] Kangwon Natl Univ, Coll Med, Dept Prevent Med, Chunchon, South Korea
Source
FRONTIERS IN NUTRITION | 2023 / Vol. 10
Funding
National Research Foundation, Singapore;
Keywords
depression; nutrition; machine learning; classification; interpretability; clinical depression; CHRONIC DISEASES; ASSOCIATION; REGRESSION; HEALTH; AGE; AI;
DOI
10.3389/fnut.2023.1165854
Chinese Library Classification (CLC) number
R15 [Nutritional hygiene, food hygiene]; TS201 [Basic science];
Discipline classification code
100403;
Abstract
Introduction: Depression is a prevalent disorder worldwide, with potentially severe implications. It contributes significantly to an increased risk of diseases associated with multiple risk factors. Early, accurate diagnosis of depressive symptoms is a critical first step toward management, intervention, and prevention. Various nutritional and dietary compounds have been suggested to be involved in the onset, maintenance, and severity of depressive disorders. Although the association between nutritional risk factors and the occurrence of depression remains difficult to characterize, the interplay of these markers has not yet been fully explored with supervised machine learning.
Methods: This study aimed to determine the ability of machine learning-based decision support methods to identify the presence of depression using publicly available health data from the Korean National Health and Nutrition Examination Survey. Two exploration techniques, uniform manifold approximation and projection (UMAP) and Pearson correlation, were used for exploratory analysis of the datasets. A grid search optimization with cross-validation was performed to fine-tune the models for classifying depression with the highest accuracy. Several performance measures, including accuracy, precision, recall, F1 score, confusion matrix, areas under the precision-recall and receiver operating characteristic curves, and calibration plot, were used to compare classifier performance. We further investigated feature importance using ELI5 visualizations, partial dependence plots, local interpretable model-agnostic explanations (LIME), and Shapley additive explanations (SHAP) to interpret predictions at both the population and individual levels.
Results: On the original dataset, the best results were an accuracy of 86.18% (XGBoost) and an area under the curve of 84.96% (random forest). On the quantile-based dataset, XGBoost achieved an accuracy of 86.02% and an area under the curve of 85.34%. The explainable results revealed a complementary observation of the relative changes in feature values, from which the importance of emergent depression risks could be identified.
Discussion: The strength of our approach is the large sample size used for training with a fine-tuned model. The machine learning-based analysis showed that the hyper-tuned model has empirically higher accuracy in classifying patients with depressive disorder, as evidenced by the set of interpretable experiments, and can be an effective solution for disease control.
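Several of the performance measures named in the abstract (accuracy, precision, recall, F1 score, the confusion matrix, and the area under the receiver operating characteristic curve) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the sketch uses only the standard library rather than whichever toolkit the study used.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 derived from the confusion matrix."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """ROC AUC as the probability that a random positive outranks a random
    negative (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = depression present, 0 = absent.
Y_TRUE = [1, 0, 1, 1, 0, 0, 1, 0]
SCORES = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
# Assumed decision threshold of 0.5 (not taken from the paper).
Y_PRED = [1 if s >= 0.5 else 0 for s in SCORES]

if __name__ == "__main__":
    print(confusion_counts(Y_TRUE, Y_PRED))
    print(classification_metrics(Y_TRUE, Y_PRED))
    print(roc_auc(Y_TRUE, SCORES))
```

The pairwise AUC formulation is quadratic in the sample size, so it is suitable only for small illustrations. On data the size of a national survey, a rank-based implementation from an established library would be the practical choice.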
Pages: 22