Speech Emotion Recognition using Context-Aware Dilated Convolution Network

Cited by: 5
Authors
Kakuba, Samuel [1]
Han, Dong Seog [2]
Affiliations
[1] Kyungpook Natl Univ, Grad Sch Elect & Elect Engn, Daegu, South Korea
[2] Kyungpook Natl Univ, Sch Elect & Elect Engn, Daegu, South Korea
Source
2022 27TH ASIA PACIFIC CONFERENCE ON COMMUNICATIONS (APCC 2022): CREATING INNOVATIVE COMMUNICATION TECHNOLOGIES FOR POST-PANDEMIC ERA | 2022
Keywords
context-aware emotion recognition; multi-head attention; dilated convolution
DOI
10.1109/APCC55198.2022.9943771
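Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract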
Deep learning-based speech emotion recognition has been applied to social living assistance, health monitoring, authentication, and other human-to-machine interaction applications. Because these applications are ubiquitous, computationally efficient and robust speech emotion recognition models are required. The nature of the speech signal requires tracking time steps and analyzing long-term dependencies, the context of the utterances, and spatial cues. Recurrent neural networks such as long short-term memory and gated recurrent units, coupled with attention mechanisms, are often used to capture long-term dependencies and context in the speech signal. However, they do not account for the spatial cues that may exist in the speech signal. Moreover, most of these systems operate sequentially, which causes slow convergence and sluggish training. Therefore, we propose a model that employs dilated convolution layers in combination with hybrid attention mechanisms. The model uses multi-head attention to extract the global context in the feature representations, which are fed into a bidirectional long short-term memory configured with self-attention to further handle the context and long-term dependencies. The model uses spectral and voice quality features extracted from the raw speech signals as input. The proposed model achieves comparable performance in terms of F1 score and accuracy. The proposed model's performance is also presented in terms of confusion matrices.
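The abstract names dilated convolution as the component that widens temporal context without recurrence. The sketch below illustrates only that general operation in plain Python; it is not the authors' implementation. The function names, the kernel, the toy input frames, and the dilation rates are all assumptions chosen for illustration, since the record does not give the paper's layer configuration.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1-D dilated convolution, cross-correlation form.

    Kernel taps are spaced `dilation` samples apart, so a short kernel
    covers a longer stretch of the input without adding weights.
    """
    span = (len(kernel) - 1) * dilation  # distance from first to last tap
    out = []
    for start in range(len(signal) - span):
        out.append(sum(w * signal[start + k * dilation]
                       for k, w in enumerate(kernel)))
    return out


def receptive_field(kernel_size, dilations):
    """Input frames seen by one output of stacked stride-1 dilated layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Hypothetical toy input: 16 feature frames forming a linear ramp.
frames = [float(i) for i in range(16)]
kernel = [1.0, 0.0, -1.0]  # illustrative difference kernel, not from the paper

y1 = dilated_conv1d(frames, kernel, dilation=1)  # compares frames 2 apart
y4 = dilated_conv1d(frames, kernel, dilation=4)  # compares frames 8 apart

# Assumed dilation schedule (1, 2, 4, 8) with kernel size 3.
rf = receptive_field(3, [1, 2, 4, 8])
```

With the same three-tap kernel, raising the dilation from 1 to 4 stretches the span each output covers from 3 frames to 9, and stacking layers with growing dilation rates expands the receptive field much faster than stacking undilated layers would. Because every output position is computed independently, the operation parallelizes across time, unlike the step-by-step processing of a recurrent layer.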
Pages: 601-604
Number of pages: 4