Likelihood-Maximizing-Based Multiband Spectral Subtraction for Robust Speech Recognition

Cited by: 0
Authors
Bagher BabaAli
Hossein Sameti
Mehran Safayani
Affiliation
[1] Sharif University of Technology, Department of Computer Engineering
Keywords
Speech Recognition; Speech Signal; Recognition Accuracy; Automatic Speech Recognition; Speech Quality;
DOI
Not available
Abstract
Automatic speech recognition performance degrades significantly when speech is affected by environmental noise. The major challenge today is to achieve good robustness in adverse noisy conditions so that automatic speech recognizers can be used in real situations. Spectral subtraction (SS) is a well-known and effective approach; it was originally designed to improve the quality of the speech signal as judged by human listeners. SS techniques usually improve the quality and intelligibility of the speech signal, whereas speech recognition systems need compensation techniques that reduce the mismatch between noisy speech features and acoustic models trained on clean speech. Nevertheless, a correlation can be expected between improved speech quality and increased recognition accuracy. This paper proposes a novel approach to this problem that treats SS and the speech recognizer not as two independent entities cascaded together, but as two interconnected components of a single system sharing the common goal of improved speech recognition accuracy. The approach feeds information from the statistical models of the recognition engine back to tune the SS parameters. With this architecture, we overcome the drawbacks of previously proposed methods and achieve better recognition accuracy. Experimental evaluations show that the proposed method achieves significant improvements in recognition rate across a wide range of signal-to-noise ratios.
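The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the per-band over-subtraction factors (`alphas`), the spectral floor (`beta`), the single diagonal Gaussian standing in for the recognizer's statistical models, and the coordinate-wise grid search are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import math


def multiband_subtract(noisy_power, noise_power, band_edges, alphas, beta=0.01):
    """Subtract a scaled noise estimate from one frame's power spectrum, band by band.

    band_edges lists bin indices; band b covers bins band_edges[b]..band_edges[b+1]-1.
    alphas holds one over-subtraction factor per band (an assumed parameterization).
    """
    if len(alphas) != len(band_edges) - 1:
        raise ValueError("need exactly one alpha per band")
    cleaned = list(noisy_power)
    for b, alpha in enumerate(alphas):
        for k in range(band_edges[b], band_edges[b + 1]):
            diff = noisy_power[k] - alpha * noise_power[k]
            # Spectral floor keeps the estimate positive after over-subtraction.
            cleaned[k] = max(diff, beta * noisy_power[k])
    return cleaned


def log_likelihood(features, mean, var):
    """Diagonal-Gaussian log-likelihood: a toy stand-in for the acoustic model score."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(features, mean, var)
    )


def tune_alphas(noisy_power, noise_power, band_edges, candidates, mean, var, beta=0.01):
    """Choose each band's alpha by maximizing the model likelihood of the cleaned frame.

    Bands are tuned one at a time over a candidate grid (an assumed search strategy).
    Input powers must be strictly positive so the log features are defined.
    """
    alphas = [1.0] * (len(band_edges) - 1)
    for b in range(len(alphas)):
        best_alpha, best_score = alphas[b], -math.inf
        for c in candidates:
            trial = alphas[:b] + [c] + alphas[b + 1:]
            cleaned = multiband_subtract(noisy_power, noise_power, band_edges, trial, beta)
            feats = [math.log(p) for p in cleaned]  # log-spectral features
            score = log_likelihood(feats, mean, var)
            if score > best_score:
                best_alpha, best_score = c, score
        alphas[b] = best_alpha
    return alphas


# Toy frame: two bands of two bins each, with made-up power values.
noisy = [4.0, 4.0, 9.0, 9.0]
noise = [1.0, 1.0, 2.0, 2.0]
edges = [0, 2, 4]
clean_mean = [math.log(p) for p in (3.0, 3.0, 5.0, 5.0)]  # assumed "clean" model mean
clean_var = [1.0, 1.0, 1.0, 1.0]
tuned = tune_alphas(noisy, noise, edges, [0.5, 1.0, 2.0], clean_mean, clean_var)
```

In this toy setup the subtraction parameters are selected by the model's score on the cleaned features rather than by a perceptual quality criterion, which is the feedback loop the abstract describes in general terms.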
Related papers
50 entries in total
  • [1] Likelihood-Maximizing-Based Multiband Spectral Subtraction for Robust Speech Recognition
    BabaAli, Bagher
    Sameti, Hossein
    Safayani, Mehran
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2009,
  • [2] Spectral Subtraction in Likelihood-Maximizing Framework for Robust Speech Recognition
    BabaAli, Bagher
    Sameti, Hossein
    Safayani, Mehran
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 980 - +
  • [3] Spectral subtraction in model distance maximizing framework for robust speech recognition
    BabaAli, Bagher
    Sameti, Hossein
    Safayani, Mehran
    ICSP: 2008 9TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, VOLS 1-5, PROCEEDINGS, 2008, : 627 - +
  • [4] Q-Gaussian based spectral subtraction for robust speech recognition
    Pardede, Hilman E.
    Shinoda, Koichi
    Iwano, Koji
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1254 - 1257
  • [5] Robust Speech Recognition Based on Multi-band Spectral Subtraction
    Wan, Yi-Long
    Zhang, Tian-Qi
    Wang, Zhi-Chao
    Jin, Jing
    2013 6TH INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING (CISP), VOLS 1-3, 2013, : 36 - 40
  • [6] Estimation of Spectral Subtraction Parameter-set for Maximizing Speech Recognition Performance
    Kubo, Shintaro
    Miyazaki, Ryoichi
    2016 IEEE 5TH GLOBAL CONFERENCE ON CONSUMER ELECTRONICS, 2016,
  • [7] Spectral subtraction using spectral harmonics for robust speech recognition in car environments
    Beh, J
    Ko, H
    COMPUTATIONAL SCIENCE - ICCS 2003, PT IV, PROCEEDINGS, 2003, 2660 : 1109 - 1116
  • [8] Speech Enhancement Based on Spectral Subtraction for Speech Recognition System
    Han, Jung-woo
    Kim, Se-young
    Kim, Ki-man
    Jung, Ji-won
    Yun, Young
    IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS (ICCE 2011), 2011, : 417 - 418
  • [9] Noise robust speech recognition based on fractional spectral subtraction and perceptual linear prediction
    Postdoctoral Station, Nanjing University of International Relations, Nanjing 210039, China
    Unknown
    Nanjing Youdian Daxue Xuebao (Ziran Kexue Ban), 2008, 4 (12-15+21):
  • [10] Likelihood-maximizing beamforming for robust hands-free speech recognition
    Seltzer, ML
    Raj, B
    Stern, RM
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2004, 12 (05): : 489 - 498