Multimodal Embeddings From Language Models for Emotion Recognition in the Wild

Cited by: 10
Authors
Tseng, Shao-Yen [1 ]
Narayanan, Shrikanth [1 ]
Georgiou, Panayiotis [2 ]
Affiliations
[1] Univ Southern Calif, Dept Elect & Comp Engn, Los Angeles, CA 90089 USA
[2] Apple Inc, Siri Understanding, Culver City, CA 90016 USA
Keywords
Acoustics; Task analysis; Feature extraction; Convolution; Emotion recognition; Context modeling; Bit error rate; Machine learning; unsupervised learning; natural language processing; speech processing; emotion recognition; SPEECH;
DOI
10.1109/LSP.2021.3065598
Chinese Library Classification (CLC)
TM (Electrical engineering); TN (Electronics and communication technology)
Discipline codes
0808; 0809
Abstract
Word embeddings such as ELMo and BERT have been shown to model word usage in language with greater efficacy through contextualized learning on large-scale language corpora, resulting in significant performance improvement across many natural language processing tasks. In this work we integrate acoustic information into contextualized lexical embeddings through the addition of a parallel stream to the bidirectional language model. This multimodal language model is trained on spoken language data that includes both text and audio modalities. We show that embeddings extracted from this model integrate paralinguistic cues into word meanings and can provide vital affective information by applying these multimodal embeddings to the task of speaker emotion recognition.
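The abstract describes integrating acoustic information into contextualized word embeddings via a parallel acoustic stream in a bidirectional language model. As a much-simplified illustration of the general idea (not the paper's actual architecture or training procedure), the sketch below fuses per-word contextual embeddings with word-aligned acoustic features by mean-pooling each word's acoustic frames and concatenating; the function name, the pooling choice, and the alignment format are all illustrative assumptions.

```python
import numpy as np

def multimodal_word_embeddings(lexical, acoustic_frames, word_spans):
    """Illustrative fusion of lexical and acoustic streams (not the paper's model).

    lexical:         (num_words, d_lex) contextual word embeddings, e.g. from a biLM
    acoustic_frames: (num_frames, d_ac) frame-level acoustic features, e.g. MFCCs
    word_spans:      list of (start, end) frame indices aligning each word to audio

    Returns (num_words, d_lex + d_ac): each word's lexical embedding concatenated
    with the mean-pooled acoustic features over that word's frame span.
    """
    pooled = np.stack([acoustic_frames[s:e].mean(axis=0) for s, e in word_spans])
    return np.concatenate([lexical, pooled], axis=1)

# Toy usage: 3 words, 4-dim lexical embeddings, 10 frames of 2-dim acoustics.
lex = np.ones((3, 4))
ac = np.arange(20, dtype=float).reshape(10, 2)
emb = multimodal_word_embeddings(lex, ac, [(0, 3), (3, 7), (7, 10)])
print(emb.shape)  # (3, 6)
```

In the paper itself the two streams are trained jointly on spoken-language data so that paralinguistic cues shape the word representations, rather than being concatenated after the fact as in this sketch.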
Pages: 608-612
Page count: 5
Related Papers (50 total)
  • [31] Emotion Recognition From Multimodal Physiological Signals Using a Regularized Deep Fusion of Kernel Machine
    Zhang, Xiaowei
    Liu, Jinyong
    Shen, Jian
    Li, Shaojie
    Hou, Kechen
    Hu, Bin
    Gao, Jin
    Zhang, Tong
    IEEE TRANSACTIONS ON CYBERNETICS, 2021, 51 (09) : 4386 - 4399
  • [32] Multimodal Emotion Recognition for Human Robot Interaction
    Adiga, Sharvari
    Vaishnavi, D. V.
    Saxena, Suchitra
    Tripathi, Shikha
    2020 7TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING & MACHINE INTELLIGENCE (ISCMI 2020), 2020, : 197 - 203
  • [33] Emotion Recognition on Multimodal with Deep Learning and Ensemble
    Dharma, David Adi
    Zahra, Amalia
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (12) : 656 - 663
  • [34] Self-Supervised EEG Emotion Recognition Models Based on CNN
    Wang, Xingyi
    Ma, Yuliang
    Cammon, Jared
    Fang, Feng
    Gao, Yunyuan
    Zhang, Yingchun
    IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING, 2023, 31 : 1952 - 1962
  • [35] Learning Alignment for Multimodal Emotion Recognition from Speech
    Xu, Haiyang
    Zhang, Hui
    Han, Kun
    Wang, Yun
    Peng, Yiping
    Li, Xiangang
    INTERSPEECH 2019, 2019, : 3569 - 3573
  • [36] Masked Graph Learning With Recurrent Alignment for Multimodal Emotion Recognition in Conversation
    Meng, Tao
    Zhang, Fuchen
    Shou, Yuntao
    Shao, Hongen
    Ai, Wei
    Li, Keqin
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 4298 - 4312
  • [37] Emotion Recognition From Expressions in Face, Voice, and Body: The Multimodal Emotion Recognition Test (MERT)
    Baenziger, Tanja
    Grandjean, Didier
    Scherer, Klaus R.
    EMOTION, 2009, 9 (05) : 691 - 704
  • [38] FMFN: A Fuzzy Multimodal Fusion Network for Emotion Recognition in Ensemble Conducting
    Han, Xiao
    Chen, Fuyang
    Ban, Junrong
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2025, 33 (01) : 168 - 179
  • [39] Multimodal Decoupled Distillation Graph Neural Network for Emotion Recognition in Conversation
    Dai, Yijing
    Li, Yingjian
    Chen, Dongpeng
    Li, Jinxing
    Lu, Guangming
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (10) : 9910 - 9924
  • [40] Emotion Recognition from Multimodal Physiological Signals for Emotion Aware Healthcare Systems
    Ayata, Deger
    Yaslan, Yusuf
    Kamasak, Mustafa E.
    JOURNAL OF MEDICAL AND BIOLOGICAL ENGINEERING, 2020, 40 (02) : 149 - 157