Likelihood-Maximizing-Based Multiband Spectral Subtraction for Robust Speech Recognition

Authors
Bagher BabaAli
Hossein Sameti
Mehran Safayani
Affiliation
[1] Sharif University of Technology, Department of Computer Engineering
Keywords
Speech Recognition; Speech Signal; Recognition Accuracy; Automatic Speech Recognition; Speech Quality
DOI
Not available
Abstract
Automatic speech recognition performance degrades significantly when speech is affected by environmental noise. The major challenge today is to achieve good robustness in adverse noisy conditions so that automatic speech recognizers can be used in real situations. Spectral subtraction (SS) is a well-known and effective approach that was originally designed to improve the quality of the speech signal as judged by human listeners. SS techniques usually improve the quality and intelligibility of the speech signal, whereas speech recognition systems need compensation techniques that reduce the mismatch between noisy speech features and acoustic models trained on clean speech. Nevertheless, a correlation can be expected between improved speech quality and increased recognition accuracy. This paper proposes a novel approach to this problem that treats SS and the speech recognizer not as two independent entities cascaded together, but as two interconnected components of a single system that share the common goal of improved recognition accuracy. The approach feeds information from the statistical models of the recognition engine back into the front end to tune the SS parameters. With this architecture, the drawbacks of previously proposed methods are overcome and better recognition accuracy is achieved. Experimental evaluations show that the proposed method achieves significant improvements in recognition rate across a wide range of signal-to-noise ratios.
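The feedback idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a single power-spectrum frame, a known noise estimate, log band energies as features, and one diagonal Gaussian standing in for the recognizer's statistical model. All function names, the band layout, the spectral floor value, and the grid of candidate over-subtraction factors are hypothetical choices made for illustration.

```python
import math


def multiband_spectral_subtraction(noisy_power, noise_power, band_edges,
                                   alphas, floor=0.01):
    """Subtract a scaled noise estimate from each band of one power frame.

    alphas holds one over-subtraction factor per band (an assumed
    parameterization). floor keeps every bin at or above a small
    fraction of the noisy power so the result stays positive.
    """
    cleaned = list(noisy_power)
    for (start, stop), alpha in zip(band_edges, alphas):
        for k in range(start, stop):
            residual = noisy_power[k] - alpha * noise_power[k]
            cleaned[k] = max(residual, floor * noisy_power[k])
    return cleaned


def log_band_energies(power, band_edges):
    """Toy feature vector: log of the summed power in each band."""
    return [math.log(sum(power[start:stop])) for start, stop in band_edges]


def gaussian_log_likelihood(features, means, variances):
    """Log-likelihood under a diagonal Gaussian (stand-in acoustic model)."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(features, means, variances)
    )


def tune_alphas(noisy_power, noise_power, band_edges, means, variances,
                candidates):
    """Pick, per band, the candidate factor that maximizes the likelihood.

    Each toy feature depends on one band only and the model is diagonal,
    so an independent grid search per band is exact for this sketch.
    """
    best = []
    for b, edges in enumerate(band_edges):
        def score(alpha, b=b, edges=edges):
            cleaned = multiband_spectral_subtraction(
                noisy_power, noise_power, [edges], [alpha])
            feature = log_band_energies(cleaned, [edges])
            return gaussian_log_likelihood(
                feature, [means[b]], [variances[b]])
        best.append(max(candidates, key=score))
    return best


# Synthetic example: clean power plus additive noise power, two bands.
clean = [4.0, 4.0, 1.0, 1.0]
noise = [1.0, 1.0, 3.0, 3.0]
noisy = [c + n for c, n in zip(clean, noise)]
bands = [(0, 2), (2, 4)]
model_means = log_band_energies(clean, bands)  # model of clean features
model_vars = [1.0, 1.0]
alphas = tune_alphas(noisy, noise, bands, model_means, model_vars,
                     [0.0, 0.5, 1.0, 1.5, 2.0])
```

In a real recognizer the stand-in Gaussian would be replaced by the likelihood of the decoded state sequence under the acoustic model, and the features would be the recognizer's own front-end features rather than raw log band energies.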