A Saliency-based Attention LSTM Model for Cognitive Load Classification from Speech

Cited by: 10
Authors
Gallardo-Antolin, Ascension [1]
Montero, Juan M. [2]
Affiliations
[1] Univ Carlos III Madrid, Dept Signal Theory & Commun, Madrid, Spain
[2] Univ Politecn Madrid, ETSIT, Speech Technol Grp, Madrid, Spain
Source
INTERSPEECH 2019 | 2019
Keywords
cognitive load; speech; LSTM; weighted pooling; auditory saliency; attention model
DOI
10.21437/Interspeech.2019-1603
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification codes
100104; 100213
Abstract
Cognitive Load (CL) refers to the amount of mental demand that a given task imposes on an individual's cognitive system, and it can affect his/her productivity in very high load situations. In this paper, we propose an automatic system capable of classifying the CL level of a speaker by analyzing his/her voice. Our research on this topic follows two main directions. In the first, we focus on the use of Long Short-Term Memory (LSTM) networks with different weighted pooling strategies for CL level classification. In the second, to overcome the need for a large amount of training data, we propose a novel attention mechanism that uses Kalinli's auditory saliency model. Experiments show that our proposal significantly outperforms both a baseline system based on Support Vector Machines (SVM) and an LSTM-based system with a logistic regression attention model.
Pages: 216-220
Page count: 5