HMM-based mask estimation for a speech recognition front-end using computational auditory scene analysis

Cited by: 0

Authors:

Park, Ji Hun ^{[1]}

Yoon, Jae Sam ^{[1]}

Kim, Hong Kook ^{[1]}

Affiliations:

[1] GIST, Dept Informat & Commun, Kwangju 500712, South Korea

Source:

2008 HANDS-FREE SPEECH COMMUNICATION AND MICROPHONE ARRAYS | 2008

Keywords:

computational auditory scene analysis; mask estimation; hidden Markov model; speech recognition; noise robustness;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification code:

070206 ; 082403 ;

Abstract:

In this paper, we propose a new mask estimation method for the computational auditory scene analysis (CASA) of speech using two microphones. The proposed method is based on a hidden Markov model (HMM) so that it can exploit the observation that mask information should be correlated over contiguous analysis frames. In other words, the HMM is used to estimate the mask information, represented as the interaural time difference (ITD) and the interaural level difference (ILD) of the two channel signals, and the estimated mask information is then used to separate the desired speech from the noisy speech. To show the effectiveness of the proposed mask estimation, we compare its speech recognition performance with that of a Gaussian kernel-based estimation method. The proposed HMM-based mask estimation method provided an average word error rate reduction of 69.14% compared with the Gaussian kernel-based mask estimation method.
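
The abstract does not give the model topology, the feature definition, or the training procedure, so the following is only a minimal sketch of the general idea: a two-state HMM with diagonal Gaussian emissions over per-frame [ITD, ILD] features, decoded with the Viterbi algorithm so that the mask of one frequency channel varies smoothly across frames. The function name `viterbi_mask`, the two-state layout, and all parameter values are assumptions made for illustration, not the authors' method.

```python
import numpy as np

def viterbi_mask(features, means, variances, trans, prior):
    """Decode a binary mask sequence for one frequency channel.

    Hypothetical setup (not taken from the paper):
      features:  (T, 2) array of per-frame [ITD, ILD] observations
      means, variances: (2, 2) arrays, one row per state
                 (state 0 = noise-dominant, state 1 = target-dominant)
      trans:     (2, 2) state transition probabilities
      prior:     (2,) initial state probabilities
    """
    T = features.shape[0]
    # Log-likelihood of every frame under each state's diagonal Gaussian.
    diff = features[:, None, :] - means[None, :, :]
    log_b = -0.5 * np.sum(diff**2 / variances
                          + np.log(2 * np.pi * variances), axis=2)
    log_a = np.log(trans)

    delta = np.log(prior) + log_b[0]       # best log-score ending in each state
    back = np.zeros((T, 2), dtype=int)     # best predecessor of each state
    for t in range(1, T):
        scores = delta[:, None] + log_a    # scores[i, j]: state i -> state j
        back[t] = np.argmax(scores, axis=0)
        delta = np.max(scores, axis=0) + log_b[t]

    # Trace the best path backwards to recover the mask.
    mask = np.zeros(T, dtype=int)
    mask[-1] = np.argmax(delta)
    for t in range(T - 1, 0, -1):
        mask[t - 1] = back[t, mask[t]]
    return mask
```

With strongly self-looping transition probabilities, a single frame whose features look noise-like tends to be absorbed into the surrounding target-dominant frames, whereas a sustained change in the features still switches the state. That is the frame-to-frame correlation the abstract refers to, in contrast to an estimator that classifies each frame independently.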


Pages: 177-180

Page count: 4
