A PITCH BASED NOISE ESTIMATION TECHNIQUE FOR ROBUST SPEECH RECOGNITION WITH MISSING DATA

被引:0
作者
Morales-Cordovilla, Juan A. [1 ]
Ma, Ning [2 ]
Sanchez, Victoria [1 ]
Carmona, Jose L. [1 ]
Peinado, Antonio M. [1 ]
Barker, Jon [2 ]
机构
[1] Univ Granada, Dept Teoria Senal Telemat & Comunicac, E-18071 Granada, Spain
[2] Univ Sheffield, Dept Comp Sci, Sheffield, South Yorkshire, England
来源
2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2011年
基金
英国工程与自然科学研究理事会;
关键词
Robust speech recognition; missing data; noise estimation; VAD; harmonic tunnelling;
D O I
暂无
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
This paper presents a noise estimation technique based on knowledge of pitch information for robust speech recognition. In the first stage the noise is estimated by means of extrapolating the noise from frames where speech is believed to be absent. These frames are detected with a proposed pitch based VAD (Voice Activity Detector). In the second stage the noise estimation is revised in voiced frames using harmonic tunnelling thechnique. The tunnelling noise estimation is used at high SNRs as an upper bound of the noise rather than a suitable estimation. A spectrogram MD (Missing Data) recognition system is chosen to evaluate the proposed noise estimation. The proposed system is compared in Aurora-2 with other similar techniques like cepstral SS (Spectral Subtraction).
引用
收藏
页码:4808 / 4811
页数:4
相关论文
共 50 条
  • [31] Noise Adaptive Training for Robust Automatic Speech Recognition
    Kalinli, Ozlem
    Seltzer, Michael L.
    Droppo, Jasha
    Acero, Alex
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (08): : 1889 - 1901
  • [32] Missing-Feature Reconstruction by Leveraging Temporal Spectral Correlation for Robust Speech Recognition in Background Noise Conditions
    Kim, Wooil
    Hansen, John H. L.
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (08): : 2111 - 2120
  • [33] Noise-Robust Speaker Recognition Combining Missing Data Techniques and Universal Background Modeling
    May, Tobias
    van de Par, Steven
    Kohlrausch, Armin
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (01): : 108 - 121
  • [34] SPEECH SEPARATION BASED ON SIGNAL-NOISE-DEPENDENT DEEP NEURAL NETWORKS FOR ROBUST SPEECH RECOGNITION
    Tu, Yan-Hui
    Du, Jun
    Dai, Li-Rong
    Lee, Chin-Hui
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 61 - 65
  • [35] Mixed environment compensation based on maximum a posteriori estimation for robust speech recognition
    Shen, Haifeng
    Liu, Gang
    Guo, Jun
    ARTIFICIAL INTELLIGENCE REVIEW, 2009, 32 (1-4) : 1 - 11
  • [36] MMSE-Based Missing-Feature Reconstruction With Temporal Modeling for Robust Speech Recognition
    Gonzalez, Jose A.
    Peinado, Antonio M.
    Ma, Ning
    Gomez, Angel M.
    Barker, Jon
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21 (03): : 624 - 635
  • [37] Mixed environment compensation based on maximum a posteriori estimation for robust speech recognition
    Haifeng Shen
    Gang Liu
    Jun Guo
    Artificial Intelligence Review, 2009, 32 : 1 - 11
  • [38] SEMI-SUPERVISED NOISE DICTIONARY ADAPTATION FOR EXEMPLAR-BASED NOISE ROBUST SPEECH RECOGNITION
    Luan, Yi
    Saito, Daisuke
    Kashiwagi, Yosuke
    Minematsu, Nobuaki
    Hirose, Keikichi
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [39] Cepstral vector normalization based on stereo data for robust speech recognition
    Buera, Luis
    Lleida, Eduardo
    Miguel, Antonio
    Ortega, Alfonso
    Saz, Oscar
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (03): : 1098 - 1113
  • [40] Sequential estimation with optimal forgetting for robust speech recognition
    Afify, M
    Siohan, O
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2004, 12 (01): : 19 - 26