Multimodal Emotion Recognition Using Compressed Graph Neural Networks

Times cited: 0
Authors
Durkic, Tijana [1]
Simic, Nikola [1]
Bajovic, Sinisa Suzie Dragana [1]
Peric, Zoran [2]
Delic, Vladan [1]
Affiliations
[1] Univ Novi Sad, Fac Tech Sci, Novi Sad 21000, Serbia
[2] Univ Nis, Fac Elect Engn, Nish 18000, Serbia
Source
SPEECH AND COMPUTER, SPECOM 2024, PT II | 2025, Vol. 15300
Keywords
Graph Neural Networks; Emotion Recognition; Multimodal Data; Compression
DOI
10.1007/978-3-031-78014-1_9
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Since electronic devices have become an integral part of life, there is a need to make communication between a human and a machine as similar as possible to communication between two people. Because interpersonal relationships are built on feelings and empathy, training machines to understand emotions and to respond in accordance with the user's emotional state has become an interesting area of technology development. To gain a more comprehensive understanding of a person's emotional state, the simultaneous use of different modalities such as audio, text, and video, further processed by a graph neural network (GNN), has recently become popular because it is well suited to tracking a conversation. However, small IoT devices commonly have constrained computational capabilities, memory resources, and power budgets, so running such a complex multimodal algorithm in real time may be difficult. In this research, we examine the use of binarization and 8-bit floating-point arithmetic for compressing COGMEN, a state-of-the-art GNN-based model. We demonstrate that, for the multimodal emotion recognition task, such constrained models can provide significant data savings while maintaining relatively high performance, as shown through experiments on the IEMOCAP dataset.
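The abstract does not say how the binarization is carried out, so the following is only a minimal sketch of one common form of weight binarization, not the authors' method or COGMEN's code. It assumes sign-based binarization with a single scale factor per weight list (the mean absolute value) and a 32-bit full-precision baseline; the function names `binarize`, `dequantize`, and `storage_bits` are hypothetical. It does not model the 8-bit floating-point variant.

```python
def binarize(weights):
    """Return (alpha, signs): one scale factor plus a +1/-1 sign per weight.

    Assumption: alpha is the mean absolute weight, and zero maps to +1.
    """
    if not weights:
        raise ValueError("weights must be non-empty")
    alpha = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return alpha, signs


def dequantize(alpha, signs):
    """Reconstruct approximate weights from the binary representation."""
    return [alpha * s for s in signs]


def storage_bits(n_weights, full_precision_bits=32):
    """Return (full, binary) storage cost in bits for n_weights weights.

    Assumption: the binary form stores 1 bit per sign plus one
    full-precision scale factor.
    """
    full = n_weights * full_precision_bits
    binary = n_weights + full_precision_bits
    return full, binary


weights = [0.5, -1.5, 0.25, -0.75]
alpha, signs = binarize(weights)
print(alpha, signs)
print(dequantize(alpha, signs))
print(storage_bits(len(weights)))
```

For large weight tensors the cost of the single scale factor becomes negligible, so under these assumptions the storage approaches 1/32 of the full-precision size, which illustrates the kind of data savings the abstract refers to.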
Pages: 109-121
Number of pages: 13