Front-End Feature Compensation for Noise Robust Speech Emotion Recognition

Cited by: 1

Authors:

Pandharipande, Meghna [1]

Chakraborty, Rupayan [1]

Panda, Ashish [1]

Das, Biswajit [1]

Kopparapu, Sunil Kumar [1]

Affiliation:

[1] TCS Res & Innovat Mumbai, Yantra Pk, Thana 400601, Maharashtra, India

Source:

2019 27TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO) | 2019

Keywords:

Emotion recognition; Noisy speech; Feature compensation; Auditory masking; Vector Taylor Series;

DOI:

10.23919/eusipco.2019.8902981

Chinese Library Classification (CLC) codes:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Robust feature compensation and selection are important aspects of the noisy speech emotion recognition (SER) task, especially in mismatched conditions, where the models are trained on clean speech and tested in noisy scenarios. Here we propose the use of front-end feature compensation techniques based on Vector Taylor Series (VTS) expansion and VTS with auditory masking (VTS-AM) to improve the performance of SER systems. On top of VTS and VTS-AM, we compare the performance of log compression and root compression applied to the mel-filter-bank energies. Further, we demonstrate the benefit of feature selection applied to the non-MFCC high-level descriptors in conjunction with VTS, VTS-AM and root compression. The system performance is compared with the popular Non-negative Matrix Factorization (NMF) based enhancement and an energy-based voice activity detector (VAD) technique, which discards silence or noisy frames in the spoken utterances. To demonstrate the efficacy of our proposed techniques, extensive experiments are conducted on two standard datasets (EmoDB and IEMOCAP), contaminated with five types of noise (Babble, F-16, Factory, Volvo, and HF-channel) from the Noisex-92 noise database at five SNR levels (0 dB, 5 dB, 10 dB, 15 dB and 20 dB).
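
As a rough illustration of two ideas named in the abstract, the sketch below mixes a noise signal into clean speech at a target SNR and contrasts log compression with root compression of filter-bank energies. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names `mix_at_snr` and `compress`, the root exponent `gamma`, the energy floor, and the synthetic signals are all hypothetical choices made here, and the abstract specifies none of them.

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db):
    """Add `noise` to `clean`, scaled so the mixture has the requested SNR (dB)."""
    # Repeat or trim the noise so it covers the whole clean signal.
    reps = int(np.ceil(len(clean) / len(noise)))
    noise = np.tile(noise, reps)[:len(clean)]
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    # Choose the gain so that p_clean / (gain^2 * p_noise) = 10^(snr_db / 10).
    gain = np.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return clean + gain * noise

def compress(energies, mode="log", gamma=0.1, floor=1e-10):
    """Log or root compression of non-negative filter-bank energies.

    `gamma` and `floor` are illustrative assumptions, not values from the paper.
    """
    e = np.maximum(energies, floor)  # avoid log(0)
    return np.log(e) if mode == "log" else e ** gamma

# Synthetic stand-ins for a speech utterance and a noise recording.
rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 220 * np.arange(16000) / 16000)
noise = rng.standard_normal(8000)

# The five SNR levels listed in the abstract.
for snr_db in (0, 5, 10, 15, 20):
    noisy = mix_at_snr(clean, noise, snr_db)
    measured = 10 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
    print(snr_db, round(float(measured), 2))
```

The loop recomputes the SNR from the mixture as a sanity check on the scaling. In a real experiment the noise would come from the Noisex-92 recordings and the energies from a mel filter bank, neither of which is reproduced here.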


Pages: 5

