A PITCH BASED NOISE ESTIMATION TECHNIQUE FOR ROBUST SPEECH RECOGNITION WITH MISSING DATA

Cited by: 0

Authors:

Morales-Cordovilla, Juan A. [1]

Ma, Ning [2]

Sanchez, Victoria [1]

Carmona, Jose L. [1]

Peinado, Antonio M. [1]

Barker, Jon [2]

Affiliations:

[1] Univ Granada, Dept Teoria Senal Telemat & Comunicac, E-18071 Granada, Spain

[2] Univ Sheffield, Dept Comp Sci, Sheffield, South Yorkshire, England

Source:

2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2011

Funding:

Engineering and Physical Sciences Research Council (EPSRC), UK;

Keywords:

Robust speech recognition; missing data; noise estimation; VAD; harmonic tunnelling;

DOI:

Not available

Chinese Library Classification:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

This paper presents a noise estimation technique for robust speech recognition that relies on pitch information. In the first stage, the noise is estimated by extrapolating it from frames where speech is believed to be absent. These frames are detected with a proposed pitch-based VAD (Voice Activity Detector). In the second stage, the noise estimate is revised in voiced frames using the harmonic tunnelling technique. At high SNRs, the tunnelling estimate is used as an upper bound on the noise rather than as an estimate in its own right. A spectrogram MD (Missing Data) recognition system is chosen to evaluate the proposed noise estimation. The proposed system is compared on Aurora-2 with other similar techniques such as cepstral SS (Spectral Subtraction).
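The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `two_stage_noise_estimate`, its arguments, the use of linear interpolation between speech-absent frames, and the element-wise minimum used to apply the upper bound are all assumptions made for illustration. The VAD decisions and the tunnelling estimate are taken as given inputs, since the abstract does not specify how they are computed.

```python
import numpy as np

def two_stage_noise_estimate(spec, speech_absent, tunnel_est):
    """Hypothetical sketch of a two-stage noise estimate.

    spec          : (T, F) magnitude spectrogram of the noisy signal.
    speech_absent : (T,) boolean flags from some VAD (assumed given).
    tunnel_est    : (T, F) tunnelling noise estimate, NaN where no
                    estimate exists, e.g. unvoiced frames (assumed given).
    """
    spec = np.asarray(spec, dtype=float)
    speech_absent = np.asarray(speech_absent, dtype=bool)
    tunnel_est = np.asarray(tunnel_est, dtype=float)

    frames = np.arange(spec.shape[0])
    noise_idx = frames[speech_absent]
    if noise_idx.size == 0:
        raise ValueError("need at least one speech-absent frame")

    # Stage 1 (assumption): per frequency bin, interpolate linearly between
    # speech-absent frames. np.interp holds the first/last value constant
    # outside the sampled range, a crude stand-in for extrapolation.
    noise = np.empty_like(spec)
    for k in range(spec.shape[1]):
        noise[:, k] = np.interp(frames, noise_idx, spec[noise_idx, k])

    # Stage 2 (assumption): where a tunnelling estimate exists, treat it as
    # an upper bound on the stage-1 estimate.
    bounded = np.isfinite(tunnel_est)
    noise[bounded] = np.minimum(noise[bounded], tunnel_est[bounded])
    return noise
```

In a missing-data system, an estimate like this would typically be compared against the noisy spectrogram to decide which time-frequency cells are reliable. That step is outside the scope of this sketch.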

Pages: 4808-4811

Page count: 4
