HMM-based mask estimation for a speech recognition front-end using computational auditory scene analysis

Cited by: 0
Authors
Park, Ji Hun [1]
Yoon, Jae Sam [1]
Kim, Hong Kook [1]
Affiliation
[1] GIST, Dept Informat & Commun, Kwangju 500712, South Korea
Source
2008 HANDS-FREE SPEECH COMMUNICATION AND MICROPHONE ARRAYS | 2008
Keywords
computational auditory scene analysis; mask estimation; hidden Markov model; speech recognition; noise robustness
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, we propose a new mask estimation method for computational auditory scene analysis (CASA) of speech using two microphones. The proposed method is based on a hidden Markov model (HMM), which incorporates the observation that mask information should be correlated across contiguous analysis frames. Specifically, the HMM estimates the mask information, represented by the interaural time difference (ITD) and the interaural level difference (ILD) of the two channel signals, and the estimated mask is then used to separate the desired speech from the noisy speech. To show the effectiveness of the proposed mask estimation, we compare its speech recognition performance with that of a Gaussian kernel-based estimation method. The proposed HMM-based mask estimation method provided an average word error rate reduction of 69.14% compared with the Gaussian kernel-based mask estimation method.
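The idea that mask decisions should be correlated across contiguous frames can be illustrated with a small Viterbi decoder. The abstract does not give the HMM topology, the emission model, or any parameter values, so everything below is an assumption made for illustration: a two-state HMM per frequency channel (0 = noise-dominant, 1 = target-dominant), diagonal Gaussian emissions over [ITD, ILD], hand-picked means and variances, and "sticky" transition probabilities. The function names `log_gauss` and `viterbi_mask` are hypothetical and are not taken from the paper.

```python
import numpy as np

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian, one value per frame."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var, axis=-1)

def viterbi_mask(features, means, variances, log_trans, log_prior):
    """Most likely state sequence (0 = noise, 1 = target) for one channel.

    features:  (T, 2) array of [ITD, ILD] per frame
    means, variances: (n_states, 2) arrays, one row per state
    log_trans: (n_states, n_states), log_trans[i, j] = log P(state j | state i)
    log_prior: (n_states,) log initial state probabilities
    """
    T = features.shape[0]
    n_states = means.shape[0]
    # Emission log-likelihoods, shape (T, n_states)
    log_emit = np.stack(
        [log_gauss(features, means[s], variances[s]) for s in range(n_states)],
        axis=1,
    )
    delta = np.empty((T, n_states))               # best log score ending in each state
    backptr = np.zeros((T, n_states), dtype=int)  # best predecessor of each state
    delta[0] = log_prior + log_emit[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans  # rows: from-state, cols: to-state
        backptr[t] = np.argmax(scores, axis=0)
        delta[t] = np.max(scores, axis=0) + log_emit[t]
    # Backtrack from the best final state
    states = np.empty(T, dtype=int)
    states[-1] = np.argmax(delta[-1])
    for t in range(T - 2, -1, -1):
        states[t] = backptr[t + 1, states[t + 1]]
    return states

# Illustrative values only (assumed, not from the paper).
means = np.array([[0.5, 6.0],    # state 0: noise-dominant
                  [0.0, 0.0]])   # state 1: target-dominant
variances = np.array([[0.04, 4.0], [0.04, 4.0]])
log_trans = np.log(np.array([[0.95, 0.05], [0.05, 0.95]]))  # sticky states
log_prior = np.log(np.array([0.5, 0.5]))

# Eleven target-like frames with one ambiguous frame in the middle.
features = np.zeros((11, 2))
features[5] = [0.3, 3.5]

# Frame-independent decision: pick the likelier state in each frame separately.
framewise = np.argmax(
    np.stack([log_gauss(features, means[s], variances[s]) for s in range(2)], axis=1),
    axis=1,
)
mask = viterbi_mask(features, means, variances, log_trans, log_prior)
print("frame-wise:", framewise)
print("HMM mask:  ", mask)
```

With these assumed settings, the frame-independent decision assigns the ambiguous frame to the noise state, while the Viterbi path keeps it in the target state, because switching state twice costs more than the single-frame likelihood gain. In a real system the emission and transition parameters would be trained rather than set by hand.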
Pages: 177-180
Number of pages: 4