Detecting Deception from Gaze and Speech Using a Multimodal Attention LSTM-Based Framework

Cited by: 17
Authors
Gallardo-Antolin, Ascension [1 ]
Montero, Juan M. [2 ]
Affiliations
[1] Univ Carlos III Madrid, Dept Signal Theory & Commun, Avda Univ 30, Madrid 28911, Spain
[2] Univ Politecn Madrid, ETSIT, Speech Technol Grp, Avda Complutense 30, Madrid 28040, Spain
Source
APPLIED SCIENCES-BASEL | 2021, Vol. 11, Iss. 14
Keywords
deception detection; multimodal; gaze; speech; LSTM; attention; fusion
DOI
10.3390/app11146393
Abstract
The automatic detection of deceptive behaviors has recently attracted the attention of the research community due to the variety of areas where it can play a crucial role, such as security or criminology. This work is focused on the development of an automatic deception detection system based on gaze and speech features. The first contribution of our research on this topic is the use of attention Long Short-Term Memory (LSTM) networks for single-modal systems with frame-level features as input. In the second contribution, we propose a multimodal system that combines the gaze and speech modalities into the LSTM architecture using two different combination strategies: Late Fusion and Attention-Pooling Fusion. The proposed models are evaluated over the Bag-of-Lies dataset, a multimodal database recorded in real conditions. On the one hand, results show that attentional LSTM networks are able to adequately model the gaze and speech feature sequences, outperforming a reference Support Vector Machine (SVM)-based system with compact features. On the other hand, both combination strategies produce better results than the single-modal systems and the multimodal reference system, suggesting that gaze and speech modalities carry complementary information for the task of deception detection that can be effectively exploited by using LSTMs.
Pages: 16
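The abstract describes the architecture only at a high level: an attention LSTM per modality operating on frame-level gaze and speech features, combined either by Late Fusion or by Attention-Pooling Fusion. The following is a minimal PyTorch sketch of how such a system could be wired together; it is not the authors' implementation, and the class names (AttentionPooling, UnimodalAttentionLSTM, DeceptionDetector), hidden sizes, feature dimensions, and the score-averaging rule used for late fusion are illustrative assumptions.

# Minimal sketch (not the authors' code) of an attention-LSTM deception classifier
# with two hypothetical fusion strategies for gaze and speech frame-level features.
import torch
import torch.nn as nn


class AttentionPooling(nn.Module):
    """Learns a scalar score per frame and collapses the LSTM output
    sequence into a single utterance-level vector via a weighted average."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, h):                               # h: (batch, time, dim)
        alpha = torch.softmax(self.score(h), dim=1)     # attention weights over time
        return (alpha * h).sum(dim=1)                   # (batch, dim)


class UnimodalAttentionLSTM(nn.Module):
    """Single-modality branch: LSTM over frame-level features + attention pooling."""
    def __init__(self, feat_dim, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.pool = AttentionPooling(hidden)

    def forward(self, x):                               # x: (batch, time, feat_dim)
        h, _ = self.lstm(x)
        return self.pool(h)                             # (batch, hidden)


class DeceptionDetector(nn.Module):
    """'late' averages per-modality logits at score level; 'attention_pool'
    applies one attention layer to the concatenated frame-level LSTM outputs."""
    def __init__(self, gaze_dim, speech_dim, hidden=64, fusion="late"):
        super().__init__()
        self.fusion = fusion
        self.gaze = UnimodalAttentionLSTM(gaze_dim, hidden)
        self.speech = UnimodalAttentionLSTM(speech_dim, hidden)
        if fusion == "late":
            self.gaze_clf = nn.Linear(hidden, 2)
            self.speech_clf = nn.Linear(hidden, 2)
        else:
            self.pool = AttentionPooling(2 * hidden)
            self.clf = nn.Linear(2 * hidden, 2)

    def forward(self, gaze_x, speech_x):
        if self.fusion == "late":
            logits_g = self.gaze_clf(self.gaze(gaze_x))
            logits_s = self.speech_clf(self.speech(speech_x))
            return (logits_g + logits_s) / 2            # simple score-level average (assumption)
        hg, _ = self.gaze.lstm(gaze_x)                  # (batch, time, hidden)
        hs, _ = self.speech.lstm(speech_x)              # (batch, time, hidden)
        h = torch.cat([hg, hs], dim=-1)                 # assumes both modalities share a frame rate
        return self.clf(self.pool(h))

For example, DeceptionDetector(gaze_dim=8, speech_dim=40, fusion="attention_pool") applied to two (batch, time, features) tensors of equal length returns per-class logits. As the names suggest, the sketch treats Late Fusion as a score-level combination of per-modality decisions and Attention-Pooling Fusion as a single attention layer over the concatenated frame-level LSTM outputs; the exact formulations in the paper may differ.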
References
44 in total
[11] Efthymiou, A. E. Thesis, University of Amsterdam, 2019.
[12] Fernandez-Diaz, M.; Gallardo-Antolin, A. An attention Long Short-Term Memory based system for automatic classification of speech intelligibility. Engineering Applications of Artificial Intelligence, 2020, 96.
[13] Fukuda, K. Eye blinks: new indices for the detection of deception. International Journal of Psychophysiology, 2001, 40(3), 239-245.
[14] Gallardo-Antolin, A. In Statistical Language and Speech Processing: 7th International Conference, SLSP 2019, Proceedings; Lecture Notes in Artificial Intelligence, Vol. 11816; 2019, p. 139. DOI: 10.1007/978-3-030-31372-2_12.
[15] Gallardo-Antolin, A.; Montero, J. M. On combining acoustic and modulation spectrograms in an attention LSTM-based system for speech intelligibility level classification. Neurocomputing, 2021, 456, 49-60.
[16] Gallardo-Antolin, A.; Montero, J. M. A Saliency-based Attention LSTM Model for Cognitive Load Classification from Speech. Interspeech 2019, 2019, pp. 216-220.
[17] Gers, F. A.; Schraudolph, N. N.; Schmidhuber, J. Learning precise timing with LSTM recurrent networks. Journal of Machine Learning Research, 2003, 3(1), 115-143.
[18] Gil-Martin, M.; Montero, J. M.; San-Segundo, R. Parkinson's Disease Detection from Drawing Movements Using Convolutional Neural Networks. Electronics, 2019, 8(8).
[19] Guo, J.; Xu, N.; Li, L.-J.; Alwan, A. Attention based CLDNNs for short-duration acoustic scene classification. Interspeech 2017, 2017, pp. 469-473.
[20] Gupta, V.; Agarwal, M.; Arora, M.; Chakraborty, T.; Singh, R.; Vatsa, M. Bag-of-Lies: A Multimodal Dataset for Deception Detection. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2019), 2019, pp. 83-90.