DPD (DePression Detection) Net: a deep neural network for multimodal depression detection

Cited by: 0

Authors:

He, Manlu [1]

Bakker, Erwin M. [1]

Lew, Michael S. [1]

Affiliations:

[1] Leiden Univ, Leiden Inst Adv Comp Sci LIACS, Niels Bohrweg 1, NL-2333 CA Leiden, Netherlands

Source:

HEALTH INFORMATION SCIENCE AND SYSTEMS | 2024 / Vol. 12 / Issue 01

Keywords:

Depression detection; Multimodal data; Deep neural network; Transformers; Graph neural networks; Ensemble model; RECOGNITION; FRAMEWORK

DOI:

10.1007/s13755-024-00311-9

Chinese Library Classification (CLC) number:

R-058

Discipline classification code:

Abstract:

Depression is one of the most prevalent mental conditions; it can impair people's productivity and lead to severe consequences. The diagnosis of this disease is complex, as it often relies on a physician's subjective interview-based screening. The aim of our work is to propose deep learning models for automatic depression detection that use different data modalities and could assist in the diagnosis of depression. Current work on automatic depression detection is mostly tested on a single dataset, so the resulting models might lack robustness, flexibility and scalability. To alleviate this problem, we design a novel Graph Neural Network-enhanced Transformer model named DePressionDetect Net (DPD Net) that leverages textual, audio and visual features and can work under two different application settings: the clinical setting and the social media setting. The model consists of a unimodal encoder module for encoding each single modality, a multimodal encoder module for integrating the multimodal information, and a detection module for producing the final prediction. We also propose a model named DePressionDetect-with-EEG Net (DPD-E Net) to incorporate Electroencephalography (EEG) signals and speech data for depression detection. Experiments across four benchmark datasets show that DPD Net and DPD-E Net can outperform the state-of-the-art models on three datasets (i.e., the E-DAIC dataset, the Twitter depression dataset and the MODMA dataset), and achieve competitive performance on the fourth (i.e., the D-vlog dataset). Ablation studies demonstrate the advantages of the proposed modules and the effectiveness of combining diverse modalities for automatic depression detection.


Pages: 17
