Music Theory-Inspired Acoustic Representation for Speech Emotion Recognition

Cited by: 5
Authors
Li, Xingfeng [1]
Shi, Xiaohan [2]
Hu, Desheng [3]
Li, Yongwei [4]
Zhang, Qingchen [1]
Wang, Zhengxia [5]
Unoki, Masashi [6]
Akagi, Masato [6]
Affiliations
[1] Hainan Univ, Grad Sch Comp Sci & Technol, Haikou 570288, Peoples R China
[2] Nagoya Univ, Sch Informat Sci, Nagoya 4648601, Japan
[3] Taiyuan Univ Technol, Coll Informat & Comp, Taiyuan 030024, Peoples R China
[4] Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
[5] Hainan Univ, Sch Comp Sci & Technol, Haikou 570288, Peoples R China
[6] Japan Adv Inst Sci & Technol, Sch Informat Sci, Nomi 9231292, Japan
Funding
National Natural Science Foundation of China
Keywords
Affective computing; speech emotion recognition; acoustic representation; music theory and speech analysis; PERCEPTION; EXPRESSION; PATTERNS; FEATURES; PITCH; PERSPECTIVE; MODALITIES; KNOWLEDGE; INTERVALS; COGNITION
DOI
10.1109/TASLP.2023.3289312
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This research presents a music theory-inspired acoustic representation (hereafter, MTAR) to improve speech emotion recognition. Emotion recognition in speech and in music has developed in parallel, yet how a music theory-inspired representation can help interpret speech emotions remains relatively poorly understood. In the present study, we use music theory to study representative acoustics associated with emotion in speech, drawing on both the vocal emotion expression and auditory emotion perception domains. In experiments assessing the role and effectiveness of the proposed representation in classifying discrete emotion categories and predicting continuous emotion dimensions, MTAR shows promising performance compared with widely used emotion recognition features based on the spectrogram, Mel-spectrogram, Mel-frequency cepstral coefficients, VGGish, and the large baseline feature sets of the INTERSPEECH challenges. This proposal opens up a novel research avenue in developing a computational acoustic representation of speech emotion via music theory.
Pages: 2534-2547
Page count: 14