HMM-based mask estimation for a speech recognition front-end using computational auditory scene analysis

Authors
Park, Ji Hun [1]
Yoon, Jae Sam [1]
Kim, Hong Kook [1]
Affiliations
[1] GIST, Dept Informat & Commun, Kwangju 500712, South Korea
Source
2008 HANDS-FREE SPEECH COMMUNICATION AND MICROPHONE ARRAYS | 2008
Keywords
computational auditory scene analysis; mask estimation; hidden Markov model; speech recognition; noise robustness
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
In this paper, we propose a new mask estimation method for computational auditory scene analysis (CASA) of speech using two microphones. The proposed method is based on a hidden Markov model (HMM), which incorporates the observation that mask information should be correlated over contiguous analysis frames. Specifically, an HMM is used to estimate the mask information, represented as the interaural time difference (ITD) and the interaural level difference (ILD) of the two channel signals, and the estimated mask information is then used to separate the desired speech from noisy speech. To show the effectiveness of the proposed mask estimation, we compare its speech recognition performance with that of a Gaussian kernel-based estimation method. The proposed HMM-based mask estimation method provided an average word error rate reduction of 69.14% compared with the Gaussian kernel-based mask estimation method.
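The idea that mask decisions should be correlated across contiguous frames can be illustrated with a small sketch. The code below is not the authors' model: it assumes a two-state HMM per frequency channel (1 = target-dominant, 0 = noise-dominant), independent Gaussian emissions over ITD and ILD, and a single self-transition probability `p_stay`. The function names, the parameter layout, and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def log_gauss(x, mean, std):
    """Log-density of a one-dimensional Gaussian."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)


def viterbi_mask(itd, ild, params, p_stay=0.9):
    """Decode a binary mask for one frequency channel with Viterbi search.

    itd, ild : per-frame ITD and ILD observations (same length).
    params   : hypothetical emission parameters,
               {state: ((itd_mean, itd_std), (ild_mean, ild_std))}.
    p_stay   : assumed probability of keeping the same mask state
               from one frame to the next.
    """
    states = (0, 1)
    log_trans = {
        (i, j): math.log(p_stay if i == j else 1.0 - p_stay)
        for i in states
        for j in states
    }

    def emit(state, t):
        (m_itd, s_itd), (m_ild, s_ild) = params[state]
        return log_gauss(itd[t], m_itd, s_itd) + log_gauss(ild[t], m_ild, s_ild)

    # Uniform prior over the two states at the first frame.
    score = {s: math.log(0.5) + emit(s, 0) for s in states}
    backpointers = []
    for t in range(1, len(itd)):
        new_score, ptr = {}, {}
        for j in states:
            best = max(states, key=lambda i: score[i] + log_trans[(i, j)])
            ptr[j] = best
            new_score[j] = score[best] + log_trans[(best, j)] + emit(j, t)
        score = new_score
        backpointers.append(ptr)

    # Trace the best state sequence back from the final frame.
    state = max(states, key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Toy parameters (assumed): target in front (ITD ~ 0 ms, ILD ~ 0 dB),
# interference off to one side.
toy_params = {
    1: ((0.0, 0.1), (0.0, 2.0)),
    0: ((0.5, 0.3), (6.0, 4.0)),
}
toy_itd = [0.0, 0.02, 0.25, 0.01, 0.0]
toy_ild = [0.1, -0.2, 3.0, 0.3, 0.0]

smoothed = viterbi_mask(toy_itd, toy_ild, toy_params, p_stay=0.9)
framewise = viterbi_mask(toy_itd, toy_ild, toy_params, p_stay=0.5)
```

With `p_stay=0.5` the transition term is the same for every path, so the decoder reduces to independent frame-by-frame decisions and the outlier in the third frame flips the mask. With `p_stay=0.9` the temporal prior outweighs that single frame and the mask stays target-dominant throughout, which is the kind of inter-frame correlation the abstract describes.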
Pages: 177-180
Page count: 4