Automated speech-based screening of depression using deep convolutional neural networks

Cited by: 30
Authors
Chlasta, Karol [1, 2]
Wolk, Krzysztof [1]
Krejtz, Izabela [2]
Affiliations
[1] Polish Japanese Acad Informat Technol, Koszykowa 86, PL-02008 Warsaw, Poland
[2] SWPS Univ Social Sci & Humanities, Chodakowska 19-31, PL-03815 Warsaw, Poland
Source
CENTERIS2019--INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS/PROJMAN2019--INTERNATIONAL CONFERENCE ON PROJECT MANAGEMENT/HCIST2019--INTERNATIONAL CONFERENCE ON HEALTH AND SOCIAL CARE INFORMATION SYSTEMS AND TECHNOLOGIES | 2019 / Vol. 164
Keywords
Deep learning; Convolutional neural network; ResNet; Early detection; Depression; Telemedicine; Public health
DOI
10.1016/j.procs.2019.12.228
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Early detection and treatment of depression are essential for promoting remission, preventing relapse, and reducing the emotional burden of the disease. Current diagnoses are primarily subjective, inconsistent across professionals, and expensive for individuals who may be in urgent need of help. This paper proposes a novel approach to automated depression detection in speech using a convolutional neural network (CNN) and multipart interactive training. The model was tested on 2568 voice samples obtained from 77 non-depressed and 30 depressed individuals. In the experiment, the data were fed to residual CNNs in the form of spectrogram images auto-generated from the audio samples. The experimental results obtained with different ResNet architectures gave a promising baseline accuracy of up to 77%. (C) 2019 The Authors. Published by Elsevier B.V.
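The abstract's step of turning audio samples into spectrogram images can be illustrated with a minimal sketch. This is not the authors' pipeline: the frame length, hop size, Hann window, 8 kHz sample rate, the synthetic test tone, and the helper names `log_spectrogram` and `to_uint8_image` are all assumptions chosen for illustration, and only NumPy is used.

```python
import numpy as np


def log_spectrogram(signal, frame_len=256, hop=128, eps=1e-10):
    """Return a log-magnitude spectrogram of shape (freq_bins, frames).

    frame_len, hop and the Hann window are illustrative assumptions,
    not values taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Magnitude of the one-sided FFT of each frame.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Convert to decibels and put frequency on the vertical axis.
    return (20.0 * np.log10(magnitude + eps)).T


def to_uint8_image(spec):
    """Min-max scale a spectrogram to an 8-bit grayscale image array."""
    lo, hi = spec.min(), spec.max()
    if hi > lo:
        scaled = (spec - lo) / (hi - lo)
    else:
        scaled = np.zeros_like(spec)
    return (scaled * 255).astype(np.uint8)


# Synthetic stand-in for a voice sample: one second of a 1 kHz tone.
sample_rate = 8000  # assumed sample rate in Hz
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = log_spectrogram(tone)
image = to_uint8_image(spec)
```

The resulting `image` array is the kind of 2-D input that could be saved as a picture and passed to an image classifier such as a ResNet; how the authors actually generated and sized their spectrograms is not stated in this record.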
Pages: 618-628
Page count: 11