Front-End Feature Compensation for Noise Robust Speech Emotion Recognition

Cited by: 1
Authors
Pandharipande, Meghna [1]
Chakraborty, Rupayan [1]
Panda, Ashish [1]
Das, Biswajit [1]
Kopparapu, Sunil Kumar [1]
Affiliations
[1] TCS Res & Innovat Mumbai, Yantra Pk, Thana 400601, Maharashtra, India
Keywords
Emotion recognition; Noisy speech; Feature compensation; Auditory masking; Vector Taylor Series;
DOI
10.23919/eusipco.2019.8902981
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes
0808; 0809
Abstract
Robust feature compensation and selection are important aspects of the noisy speech emotion recognition (SER) task, especially in mismatched conditions, where models are trained on clean speech and tested in noisy scenarios. Here we propose the use of front-end feature compensation techniques based on Vector Taylor Series (VTS) expansion and VTS with auditory masking (VTS-AM) to improve the performance of SER systems. On top of VTS and VTS-AM, we compare the performance of log compression and root compression applied to the mel-filter-bank energies. Further, we demonstrate the benefit of feature selection applied to the non-MFCC high-level descriptors in conjunction with VTS, VTS-AM and root compression. The system performance is compared with popular Non-negative Matrix Factorization (NMF) based enhancement and an energy-based voice activity detector (VAD) technique, which discards silent or noisy frames in the spoken utterances. To demonstrate the efficacy of the proposed techniques, extensive experiments are conducted on two standard datasets (EmoDB and IEMOCAP), contaminated with five types of noise (Babble, F-16, Factory, Volvo, and HF-channel) from the Noisex-92 noise database at five SNR levels (0 dB, 5 dB, 10 dB, 15 dB and 20 dB).
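The abstract mentions two operations that are easy to illustrate: contaminating clean speech with noise at a chosen SNR, and compressing mel-filter-bank energies with either a logarithm or a root. The following is a minimal sketch of those two steps only, not the authors' implementation. The function names, the energy floor, and the root value of 7 are assumptions made for illustration; the abstract does not state them, and VTS, VTS-AM, NMF and the VAD are not shown.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that the speech-to-noise power ratio is snr_db."""
    # Repeat or trim the noise so it matches the speech length.
    noise = np.resize(noise, speech.shape)
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # Noise power needed to reach the requested SNR.
    target_noise_power = speech_power / (10.0 ** (snr_db / 10.0))
    gain = np.sqrt(target_noise_power / noise_power)
    return speech + gain * noise

def measured_snr_db(speech, noisy):
    """SNR in dB of a noisy signal, given the clean speech it was built from."""
    residual = noisy - speech
    return 10.0 * np.log10(np.mean(speech ** 2) / np.mean(residual ** 2))

def log_compress(energies, floor=1e-10):
    """Log compression; the floor (an assumed value) avoids log(0)."""
    return np.log(np.maximum(energies, floor))

def root_compress(energies, root=7.0):
    """Root compression; the root order is a hypothetical choice."""
    return np.power(np.maximum(energies, 0.0), 1.0 / root)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    speech = rng.standard_normal(16000)   # stand-in for a clean utterance
    noise = rng.standard_normal(4000)     # stand-in for a noise recording
    for snr in (0, 5, 10, 15, 20):
        noisy = mix_at_snr(speech, noise, snr)
        print(snr, round(measured_snr_db(speech, noisy), 3))
```

Unlike the logarithm, root compression stays bounded as the energy approaches zero, which is one commonly cited reason for trying it on noisy filter-bank energies.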
Pages: 5
Related papers
50 in total
  • [31] A biological front-end processing for speech recognition
    Ferrandez, JM
    del Valle, D
    Rodellar, V
    Gomez, P
    BIOLOGICAL AND ARTIFICIAL COMPUTATION: FROM NEUROSCIENCE TO TECHNOLOGY, 1997, 1240 : 1058 - 1067
  • [32] Noise Robust Speech Recognition Based on Noise-Adapted HMMs Using Speech Feature Compensation
    Chung, Yong-Joo
    2013 INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER SCIENCE APPLICATIONS AND TECHNOLOGIES (ACSAT), 2014, : 132 - 135
  • [33] NOISE ADAPTIVE FRONT-END NORMALIZATION BASED ON VECTOR TAYLOR SERIES FOR DEEP NEURAL NETWORKS IN ROBUST SPEECH RECOGNITION
    Li, Bo
    Sim, Khe Chai
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7408 - 7412
  • [34] A Unified Front-end Anti-interference Approach for Robust Automatic Speech Recognition
    Liang, Yunming
    Zhou, Yi
    Ma, Yongbao
    Liu, Hongqing
    2019 IEEE 19TH INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY (ISSPIT 2019), 2019,
  • [35] Robust connected digit recognition using speech enhancement and an auditory model front-end
    Flynn, Ronan
    Jones, Edward
    2007 6TH INTERNATIONAL CONFERENCE ON INFORMATION, COMMUNICATIONS & SIGNAL PROCESSING, VOLS 1-4, 2007, : 410 - +
  • [36] Comparing Front-End Enhancement Techniques and Multiconditioned Training for Robust Automatic Speech Recognition
    Soni, Meet H.
    Joshi, Sonal
    Panda, Ashish
    TEXT, SPEECH, AND DIALOGUE (TSD 2019), 2019, 11697 : 329 - 340
  • [37] A new approach to variable frame rate front-end processing for robust speech recognition
    Epps, J
    ISSPA 2005: The 8th International Symposium on Signal Processing and its Applications, Vols 1 and 2, Proceedings, 2005, : 723 - 726
  • [38] Multi-microphone noise reduction techniques as front-end devices for speech recognition
    Bitzer, J
    Simmer, KU
    Kammeyer, KD
    SPEECH COMMUNICATION, 2001, 34 (1-2) : 3 - 12
  • [39] INTERACTIVE FEATURE FUSION FOR END-TO-END NOISE-ROBUST SPEECH RECOGNITION
    Hu, Yuchen
    Hou, Nana
    Chen, Chen
    Chng, Eng Siong
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 6292 - 6296
  • [40] Automatic Speech Recognition with a Cochlear Implant Front-End
    Nogueira, Waldo
    Harczos, Tamas
    Edler, Bernd
    Ostermann, Joern
    Buechner, Andreas
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1993 - +