Decision Tree Based Depression Classification from Audio Video and Language Information

Cited by: 89
Authors
Yang, Le [1]
Jiang, Dongmei [1]
He, Lang [1]
Pei, Ercheng [1]
Oveneke, Meshia Cedric [2]
Sahli, Hichem [3, 4]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, NPU VUB Joint AVSP Lab, 127 Youyi Xilu, Xian 710072, Peoples R China
[2] Vrije Univ Brussel, ETRO, Dept Elect & Informat, NPU VUB Joint AVSP Lab, Pl Laan 2, B-1050 Brussels, Belgium
[3] VUB, Dept ETRO, NPU VUB Joint AVSP Lab, Pl Laan 2, B-1050 Brussels, Belgium
[4] Interuniv Microelect Ctr, Kapeldreef 75, B-3001 Heverlee, Belgium
Source
PROCEEDINGS OF THE 6TH INTERNATIONAL WORKSHOP ON AUDIO/VISUAL EMOTION CHALLENGE (AVEC'16) | 2016
Keywords
Depression classification; decision tree; multi-modal
DOI
10.1145/2988257.2988269
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
To improve recognition accuracy in the Depression Classification Sub-Challenge (DCC) of AVEC 2016, in this paper we propose a decision tree for depression classification. The decision tree is constructed according to the distribution of the multimodal prediction of PHQ-8 scores and participants' characteristics (PTSD/depression diagnosis, sleep status, feeling and personality) obtained via analysis of the participants' transcript files. The proposed gender-specific decision tree provides a way of fusing the upper-level language information with the results obtained using low-level audio and visual features. Experiments are carried out on the Distress Analysis Interview Corpus - Wizard of Oz (DAIC-WOZ) database. Results show that the proposed depression classification schemes obtain very promising results on the development set, with the F1 score reaching 0.857 for class depressed and 0.964 for class not depressed. Despite the over-fitting problem in training the models that predict the PHQ-8 scores, the classification schemes still obtain satisfactory performance on the test set. The F1 score reaches 0.571 for class depressed and 0.877 for class not depressed, with an average of 0.724, which is higher than the baseline result of 0.700.
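The averaged test-set figure in the abstract can be checked with a short calculation. The sketch below assumes the reported average is the unweighted mean of the two per-class F1 scores, which is consistent with the numbers given; the function and variable names are illustrative and do not come from the paper.

```python
def mean_f1(f1_depressed: float, f1_not_depressed: float) -> float:
    """Unweighted mean of the two per-class F1 scores (assumed averaging scheme)."""
    return (f1_depressed + f1_not_depressed) / 2


# Test-set per-class F1 scores and baseline, as reported in the abstract.
test_mean = mean_f1(0.571, 0.877)
baseline_mean = 0.700

print(round(test_mean, 3))        # 0.724, matching the reported average
print(test_mean > baseline_mean)  # True: above the reported baseline
```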
Pages: 89-96
Page count: 8
Related papers
23 in total
[1]   Head Pose and Movement Analysis as an Indicator of Depression [J].
Alghowinem, Sharifa ;
Goecke, Roland ;
Wagner, Michael ;
Parker, Gordon ;
Breakspear, Michael .
2013 HUMAINE ASSOCIATION CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), 2013, :283-288
[2]   From Joyous to Clinically Depressed: Mood Detection using Multimodal Analysis of a Person's Appearance and Speech [J].
Alghowinem, Sharifa .
2013 HUMAINE ASSOCIATION CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), 2013, :648-653
[3]  
[Anonymous], 2016, P IEEE WINT C APPL C
[4]  
[Anonymous], P MLSP REIMS FRANC
[5]  
[Anonymous], 2016, P 6 INT WORKSH AUD V
[6]  
Cummins N., 2013, P 3 ACM INT WORKSHOP, P11
[7]  
Cummins N, 2011, 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, P3008
[8]  
Gamon M., 2013, AAAI
[9]   Nonverbal social withdrawal in depression: Evidence from manual and automatic analyses [J].
Girard, Jeffrey M. ;
Cohn, Jeffrey F. ;
Mahoor, Mohammad H. ;
Mavadati, S. Mohammad ;
Hammal, Zakia ;
Rosenwald, Dean P. .
IMAGE AND VISION COMPUTING, 2014, 32 (10) :641-647
[10]  
Gratch J, 2014, LREC 2014 - NINTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, P3123