DPD (DePression Detection) Net: a deep neural network for multimodal depression detection

Cited by: 0
Authors
He, Manlu [1]
Bakker, Erwin M. [1]
Lew, Michael S. [1]
Affiliation
[1] Leiden Univ, Leiden Inst Adv Comp Sci LIACS, Niels Bohrweg 1, NL-2333 CA Leiden, Netherlands
Source
HEALTH INFORMATION SCIENCE AND SYSTEMS | 2024 / Volume 12 / Issue 01
Keywords
Depression detection; Multimodal data; Deep neural network; Transformers; Graph neural networks; Ensemble model; RECOGNITION; FRAMEWORK;
DOI
10.1007/s13755-024-00311-9
Chinese Library Classification (CLC) number
R-058
Abstract
Depression is one of the most prevalent mental health conditions; it can impair people's productivity and lead to severe consequences. Diagnosis is complex because it often relies on a physician's subjective, interview-based screening. The aim of our work is to propose deep learning models for automatic depression detection that use different data modalities and could assist in the diagnosis of depression. Most existing work on automatic depression detection is tested on a single dataset, so the resulting models might lack robustness, flexibility and scalability. To alleviate this problem, we design a novel Graph Neural Network-enhanced Transformer model named DePressionDetect Net (DPD Net) that leverages textual, audio and visual features and can work under two different application settings: the clinical setting and the social media setting. The model consists of a unimodal encoder module for encoding each single modality, a multimodal encoder module for integrating the multimodal information, and a detection module for producing the final prediction. We also propose a model named DePressionDetect-with-EEG Net (DPD-E Net) that incorporates Electroencephalography (EEG) signals and speech data for depression detection. Experiments across four benchmark datasets show that DPD Net and DPD-E Net can outperform the state-of-the-art models on three datasets (the E-DAIC dataset, the Twitter depression dataset and the MODMA dataset) and achieve competitive performance on the fourth (the D-vlog dataset). Ablation studies demonstrate the advantages of the proposed modules and the effectiveness of combining diverse modalities for automatic depression detection.
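The three-stage structure named in the abstract (unimodal encoders, a multimodal encoder, a detection module) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation: mean pooling with a dense projection stands in for the Transformer-based unimodal encoders, a softmax-weighted sum stands in for the Graph Neural Network-enhanced multimodal encoder, and a logistic unit stands in for the detection module. All function names, feature sizes and the random, untrained weights are assumptions made for illustration.

```python
import math
import random


def linear(vec, weights, bias):
    """Dense layer; `weights` holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def init_linear(rng, n_in, n_out):
    """Random, untrained parameters (illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def unimodal_encode(sequence, params):
    """Stand-in unimodal encoder: mean-pool over time, project, apply tanh."""
    dim = len(sequence[0])
    pooled = [sum(frame[i] for frame in sequence) / len(sequence)
              for i in range(dim)]
    return [math.tanh(v) for v in linear(pooled, *params)]


def multimodal_fuse(embeddings):
    """Stand-in multimodal encoder: softmax-weighted sum of the embeddings."""
    # Hypothetical salience score: squared norm of each modality embedding.
    scores = [sum(v * v for v in emb) for emb in embeddings]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    hidden = len(embeddings[0])
    return [sum(w * emb[i] for w, emb in zip(weights, embeddings))
            for i in range(hidden)]


def detect(fused, params):
    """Stand-in detection module: logistic unit giving a value in (0, 1)."""
    (logit,) = linear(fused, *params)
    return 1.0 / (1.0 + math.exp(-logit))


def predict(sample, encoders, head):
    """Run the three stages in order for one multimodal sample."""
    embeddings = [unimodal_encode(sample[name], encoders[name])
                  for name in sorted(encoders)]
    return detect(multimodal_fuse(embeddings), head)


rng = random.Random(0)
HIDDEN = 8
FEATURE_DIMS = {"text": 16, "audio": 12, "visual": 10}  # hypothetical sizes
encoders = {name: init_linear(rng, dim, HIDDEN)
            for name, dim in FEATURE_DIMS.items()}
head = init_linear(rng, HIDDEN, 1)

# One synthetic sample: five time steps of random features per modality.
sample = {name: [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(5)]
          for name, dim in FEATURE_DIMS.items()}
probability = predict(sample, encoders, head)
print(f"predicted probability (untrained weights): {probability:.3f}")
```

Because the weights are random, the printed value carries no diagnostic meaning; the sketch only shows how per-modality encodings are fused into a single representation before one prediction is produced.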
Pages: 17