Decision Tree Based Depression Classification from Audio Video and Language Information

Cited by: 89
Authors
Yang, Le [1 ]
Jiang, Dongmei [1 ]
He, Lang [1 ]
Pei, Ercheng [1 ]
Oveneke, Meshia Cedric [2 ]
Sahli, Hichem [3 ,4 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, NPU VUB Joint AVSP Lab, 127 Youyi Xilu, Xian 710072, Peoples R China
[2] Vrije Univ Brussel, ETRO, Dept Elect & Informat, NPU VUB Joint AVSP Lab, Pl Laan 2, B-1050 Brussels, Belgium
[3] VUB, Dept ETRO, NPU VUB Joint AVSP Lab, Pl Laan 2, B-1050 Brussels, Belgium
[4] Interuniv Microelect Ctr, Kepeldreef 75, B-3001 Heverlee, Belgium
Source
PROCEEDINGS OF THE 6TH INTERNATIONAL WORKSHOP ON AUDIO/VISUAL EMOTION CHALLENGE (AVEC'16) | 2016
Keywords
Depression classification; decision tree; multi-modal
DOI
10.1145/2988257.2988269
CLC number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
To improve the recognition accuracy in the Depression Classification Sub-Challenge (DCC) of AVEC 2016, in this paper we propose a decision tree for depression classification. The decision tree is constructed according to the distribution of the multimodal predictions of the PHQ-8 scores and of participants' characteristics (PTSD/depression diagnosis, sleep status, feeling and personality) obtained by analyzing the participants' transcript files. The proposed gender-specific decision tree provides a way of fusing the upper-level language information with the results obtained using low-level audio and visual features. Experiments are carried out on the Distress Analysis Interview Corpus - Wizard of Oz (DAIC-WOZ) database. The results show that the proposed depression classification schemes obtain very promising results on the development set, with an F1 score of 0.857 for the depressed class and 0.964 for the not-depressed class. Despite the over-fitting problem in training the PHQ-8 score prediction models, the classification schemes still obtain satisfactory performance on the test set. The F1 score reaches 0.571 for the depressed class and 0.877 for the not-depressed class, with an average of 0.724, which is higher than the baseline result of 0.700.
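The following is a minimal, self-contained sketch (not the authors' implementation) of the fusion idea described in the abstract: a gender-specific decision tree that combines a multimodal PHQ-8 score prediction with transcript-derived participant characteristics. The synthetic data, the feature layout, and the PHQ-8 >= 10 cut-off for the depressed class are illustrative assumptions; only the overall scheme follows the paper.

# Hedged sketch of gender-specific decision-tree fusion for depression
# classification. Features and data are synthetic stand-ins for the paper's
# multimodal PHQ-8 prediction and transcript-derived characteristics.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
n = 100  # hypothetical number of participants

# Columns (assumed for illustration): multimodal PHQ-8 prediction (0-24),
# PTSD/depression-diagnosis flag, poor-sleep flag, negative-feeling flag.
X = np.column_stack([
    rng.uniform(0, 24, n),
    rng.integers(0, 2, n),
    rng.integers(0, 2, n),
    rng.integers(0, 2, n),
])
gender = rng.integers(0, 2, n)       # 0 = female, 1 = male
y = (X[:, 0] >= 10).astype(int)      # PHQ-8 >= 10 taken here as "depressed"

# Train one shallow tree per gender, mirroring the gender-specific design,
# and report per-class F1 as in the paper's evaluation protocol.
for g in (0, 1):
    mask = gender == g
    tree = DecisionTreeClassifier(max_depth=3, random_state=0)
    tree.fit(X[mask], y[mask])
    pred = tree.predict(X[mask])
    print(f"gender={g}  F1(depressed)={f1_score(y[mask], pred):.3f}  "
          f"F1(not depressed)={f1_score(y[mask], pred, pos_label=0):.3f}")

On real data the tree would be fit on development-set predictions and transcript cues rather than on synthetic features, and the split points would correspond to the characteristic thresholds the paper derives from the PHQ-8 score distribution.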
Pages: 89-96
Number of pages: 8