Speech Emotion Recognition using Context-Aware Dilated Convolution Network

Cited by: 5

Authors:

Kakuba, Samuel [1]

Han, Dong Seog [2]

Affiliations:

[1] Kyungpook Natl Univ, Grad Sch Elect & Elect Engn, Daegu, South Korea

[2] Kyungpook Natl Univ, Sch Elect & Elect Engn, Daegu, South Korea

Source:

2022 27TH ASIA PACIFIC CONFERENCE ON COMMUNICATIONS (APCC 2022): CREATING INNOVATIVE COMMUNICATION TECHNOLOGIES FOR POST-PANDEMIC ERA | 2022

Keywords:

context-aware emotion recognition; multi-head attention; dilated convolution

DOI:

10.1109/APCC55198.2022.9943771

Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

Deep learning-based speech emotion recognition has been applied to social living assistance, health monitoring, authentication, and other human-to-machine interaction applications. Because these applications are ubiquitous, computationally efficient and robust speech emotion recognition models are required. The nature of the speech signal requires tracking time steps and analyzing long-term dependencies, the contexts of the utterances, and the spatial cues. Recurrent neural networks such as long short-term memory and gated recurrent units, coupled with attention mechanisms, are often used to capture long-term dependencies and context in the speech signal. However, they do not capture the spatial cues that may exist in the speech signal. Moreover, most of these systems operate sequentially, which causes slow convergence and sluggish training. Therefore, we propose a model that employs dilated convolution layers in combination with hybrid attention mechanisms. The model uses multi-head attention to extract the global context in the feature representations, which are fed into a bidirectional long short-term memory network configured with self-attention to further handle the context and long-term dependencies. The model uses spectral and voice quality features extracted from the raw speech signals as input. The proposed model achieves comparable performance in terms of F1 score and accuracy. The proposed model's performance is also presented in terms of confusion matrices.
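
The dilated convolution mentioned in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract gives no kernel sizes or dilation rates, so the summing kernel, the dilation values, and the helper names `dilated_conv1d` and `receptive_field` are assumptions chosen only to show how dilation widens the temporal context without adding weights.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1-D dilated convolution in cross-correlation form."""
    # Distance between the first and last input sample touched by one kernel placement.
    span = (len(kernel) - 1) * dilation
    out = []
    for t in range(len(signal) - span):
        # Kernel taps are spaced `dilation` samples apart.
        out.append(sum(k * signal[t + i * dilation] for i, k in enumerate(kernel)))
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field (in input frames) of stacked stride-1 dilated conv layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Toy stand-in for one channel of frame-level speech features (assumed data).
frames = [float(i) for i in range(16)]
kernel = [1.0, 1.0, 1.0]  # hypothetical 3-tap summing kernel

y1 = dilated_conv1d(frames, kernel, dilation=1)  # taps at t, t+1, t+2
y4 = dilated_conv1d(frames, kernel, dilation=4)  # taps at t, t+4, t+8

# Three stacked layers with assumed dilation rates 1, 2, 4.
rf = receptive_field(3, [1, 2, 4])
```

With the same three weights, the dilation-4 layer spans nine input frames per output instead of three, which is the property that lets stacked dilated layers cover long utterance contexts while remaining parallelizable, unlike a recurrent layer.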

Pages: 601 - 604

Page count: 4

50 entries in total

[21] Incorporating structured emotion commonsense knowledge and interpersonal relation into context-aware emotion recognition
Chen, Jing
Yang, Tao
Huang, Ziqiang
Wang, Kejun
Liu, Meichen
Lyu, Chunyan
APPLIED INTELLIGENCE, 2023, 53 (04) : 4201 - 4217
[22] Regional Attention Networks with Context-aware Fusion for Group Emotion Recognition
Khan, Ahmed Shehab
Li, Zhiyuan
Cai, Jie
Tong, Yan
2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2021), 2021, : 1149 - 1158
[23] DVC-Net: a new dual-view context-aware network for emotion recognition in the wild
Qing, Linbo
Wen, Hongqian
Chen, Honggang
Jin, Rulong
Cheng, Yongqiang
Peng, Yonghong
NEURAL COMPUTING & APPLICATIONS, 2023, 36 (02) : 653 - 665
[24] DVC-Net: a new dual-view context-aware network for emotion recognition in the wild
Linbo Qing
Hongqian Wen
Honggang Chen
Rulong Jin
Yongqiang Cheng
Yonghong Peng
Neural Computing and Applications, 2024, 36 : 653 - 665
[25] CONTEXT-AWARE NEURAL CONFIDENCE ESTIMATION FOR RARE WORD SPEECH RECOGNITION
Qiu, David
Munkhdalai, Tsendsuren
He, Yanzhang
Sim, Khe Chai
2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 31 - 37
[26] Multi-Encoder Sequential Attention Network for Context-Aware Speech Recognition in Japanese Dialog Conversation
Tachimori, Nobuya
Sakti, Sakriani
Nakamura, Satoshi
2021 24th Conference of the Oriental COCOSDA International Committee for the Co-Ordination and Standardisation of Speech Databases and Assessment Techniques, O-COCOSDA 2021, 2021, : 1 - 6
[27] Convolution neural network with multiple pooling strategies for speech emotion recognition
Jiang, Pengxu
Zou, Cairong
2022 6TH INTERNATIONAL SYMPOSIUM ON COMPUTER SCIENCE AND INTELLIGENT CONTROL, ISCSIC, 2022, : 89 - 92
[28] Context-aware SAR image ship detection and recognition network
Li, Chao
Yue, Chenke
Li, Hanfu
Wang, Zhile
FRONTIERS IN NEUROROBOTICS, 2024, 18
[29] MULTI-ENCODER SEQUENTIAL ATTENTION NETWORK FOR CONTEXT-AWARE SPEECH RECOGNITION IN JAPANESE DIALOG CONVERSATION
Tachimori, Nobuya
Sakti, Sakriani
Nakamura, Satoshi
2021 24TH CONFERENCE OF THE ORIENTAL COCOSDA INTERNATIONAL COMMITTEE FOR THE CO-ORDINATION AND STANDARDISATION OF SPEECH DATABASES AND ASSESSMENT TECHNIQUES (O-COCOSDA), 2021, : 1 - 6
[30] Investigation of Context-aware System Using Activity Recognition
Watanabe, Yuki
Suzumura, Reiji
Matsuno, Shogo
Ohyama, Minoru
2019 1ST INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE IN INFORMATION AND COMMUNICATION (ICAIIC 2019), 2019, : 287 - 291
