SNR-Based Mask Compensation for Computational Auditory Scene Analysis Applied to Speech Recognition in a Car Environment

Cited by: 0

Authors:

Park, Ji Hun ^{[1]}

Kim, Seon Man ^{[1]}

Yoon, Jae Sam ^{[1]}

Kim, Hong Kook ^{[1]}

Lee, Sung Joo ^{[2]}

Lee, Yunkeun ^{[2]}

Affiliations:

[1] Gwangju Inst Sci & Technol, Sch Informat & Commun, Kwangju 500712, South Korea

[2] Elect & Telecommun Res Inst, Speech Proc Team, Speech & Language Informat Res Div, Daejeon 305350, South Korea

Source:

11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2 | 2010

Keywords:

Speech recognition; speech separation; computational auditory scene analysis; mask compensation; beamforming

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In this paper, we propose a computational auditory scene analysis (CASA)-based front-end for two-microphone speech recognition in a car environment. An important issue in CASA is the accurate estimation of the mask information used to separate target speech from noisy speech recorded by multiple microphones. To this end, the time-frequency mask information is compensated using the signal-to-noise ratio (SNR) obtained from a beamformer, so that the mask reflects the amount of noise contained in the noisy speech. We evaluate the performance of an automatic speech recognition (ASR) system that employs a CASA-based front-end with the proposed mask compensation method, and compare it with systems that employ a CASA-based front-end without mask compensation and a beamforming-based front-end. The CASA-based front-end achieves an average word error rate (WER) reduction of 8.57% when the proposed mask compensation method is applied. In addition, the CASA-based front-end with the proposed method provides a relative WER reduction of 26.52% compared with the beamforming-based front-end.

Pages: 725 / +

Page count: 2

50 items in total

[1] HMM-based mask estimation for a speech recognition front-end using computational auditory scene analysis
Park, Ji Hun
Yoon, Jae Sam
Kim, Hong Kook
2008 HANDS-FREE SPEECH COMMUNICATION AND MICROPHONE ARRAYS, 2008, : 177 - 180
[2] HMM-Based mask estimation for a speech recognition front-end using computational auditory scene analysis
Park, Ji Hun
Yoon, Jae Sam
Kim, Hong Kook
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2008, E91D (09) : 2360 - 2364
[3] A Computational Auditory Scene Analysis System for Robust Speech Recognition
Srinivasan, Soundararajan
Shao, Yang
Jin, Zhaozhang
Wang, DeLiang
INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 73 - +
[4] Linking computational auditory scene analysis to automatic speech recognition
Cooke, M
Morris, A
Green, P
ACUSTICA, 1996, 82 : S87 - S87
[5] A computational auditory scene analysis system for speech segregation and robust speech recognition
Shao, Yang
Srinivasan, Soundararajan
Jin, Zhaozhang
Wang, DeLiang
COMPUTER SPEECH AND LANGUAGE, 2010, 24 (01) : 77 - 93
[6] A series of SNR-based speech intelligibility models in the Auditory Modeling Toolbox
Lavandier, Mathieu
Vicente, Thibault
Prud'homme, Luna
ACTA ACUSTICA, 2022, 6
[7] Separation of Reverberant Speech Based on Computational Auditory Scene Analysis
Li Hongyan
Cao Meng
Wang Yue
AUTOMATIC CONTROL AND COMPUTER SCIENCES, 2018, 52 (06) : 561 - 571
[8] Robust front-end for speech recognition based on computational auditory scene analysis and speaker model
Guan, Yong
Li, Peng
Liu, Wen-Ju
Xu, Bo
Zidonghua Xuebao / Acta Automatica Sinica, 2009, 35 (04) : 410 - 416
[9] Improved monaural speech segregation based on computational auditory scene analysis
Wang Yu
Lin Jiajun
Chen Ning
Yuan Wenhao
EURASIP Journal on Audio, Speech, and Music Processing, 2013
