Distant-talking speech recognition with microphone-array sound pickup and NN/MLLR environment equalization

Cited by: 0
Authors
Lin, QG [1]
Flanagan, J [1]
Che, CW [1]
Affiliations
[1] IBM Corp, Thomas J Watson Res Ctr, Yorktown Heights, NY 10598 USA
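Source
PROGRESS IN CONNECTIONIST-BASED INFORMATION SYSTEMS, VOLS 1 AND 2 | 1998
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract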
Most contemporary speech recognizers are designed to operate with close-talking speech, and they work best in a quiet, matched training/testing condition. The objective of the paper is to explore the utility of existing speech recognition technology in adverse "real-world" environments for distant-talking applications. More specifically, a microphone array (MA) is used for effective sound capture to mitigate environmental interference introduced by reverberation and ambient noise, and environment equalizers are employed to approximate a matched training/testing condition for the recognizer so that it need not be retrained. Both neural network (NN) and maximum likelihood linear regression (MLLR) techniques have been used for environment equalization. Experimental results show that the combined MA-NN-MLLR system is able to elevate distant-talking recognition performance to an extent that is competitive with a retrained speech recognizer.
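As an illustration of how a microphone array can favor the talker over ambient noise, the sketch below shows delay-and-sum beamforming, one common form of array processing. The abstract does not say which array processing the authors used, so this is an assumption made for illustration only. The function name `delay_and_sum`, the integer sample delays, and the toy signals are all hypothetical.

```python
from typing import List, Sequence


def delay_and_sum(channels: Sequence[Sequence[float]],
                  delays: Sequence[int]) -> List[float]:
    """Align each channel by its integer sample delay, then average.

    channels[i][n] is sample n of microphone i. delays[i] is how many samples
    later the source reaches microphone i than the earliest microphone
    (hypothetical inputs, assumed known and non-negative).
    """
    if len(channels) != len(delays):
        raise ValueError("need one delay per channel")
    if any(d < 0 for d in delays):
        raise ValueError("delays must be non-negative")
    # Only the span covered by every channel after alignment is usable.
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    if length <= 0:
        return []
    out = []
    for n in range(length):
        # Advance each channel by its delay so the source lines up across mics.
        total = sum(ch[n + d] for ch, d in zip(channels, delays))
        out.append(total / len(channels))
    return out


# Toy example: the same pulse reaches mic 1 one sample late, mic 2 two samples late.
pulse = [0.0, 1.0, 0.0, 0.0]
mics = [
    pulse + [0.0, 0.0],
    [0.0] + pulse + [0.0],
    [0.0, 0.0] + pulse,
]
aligned = delay_and_sum(mics, delays=[0, 1, 2])
```

With the correct delays the pulse adds coherently across the three channels, while a mismatched set of delays (for example all zeros) spreads it out and lowers its peak. That is the basic reason an array steered at the talker attenuates sound arriving from other directions.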
Pages: 1099-1102
Page count: 4