Feature enhancement for a bitstream-based front-end in wireless speech recognition

Cited by: 0
Authors
Kim, HK [1]
Cox, RV [1]
Affiliation
[1] AT&T Labs Res, Florham Pk, NJ 07932 USA
Source
2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURAL NETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING | 2001
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
In this paper, we propose a feature enhancement algorithm for wireless speech recognition in adverse acoustic environments. A speech recognition system is realized at the receiver side of a wireless communications system and feature parameters are extracted directly from the bitstream of the speech coder employed in the system. The feature parameters are composed of spectral envelope and coder-specific information. The proposed feature enhancement algorithm incorporates feature parameters obtained from the decoded speech and an enhanced version into the bitstream-based feature parameters. Moreover, the coder-specific parameters are improved by reestimating the codebook gains and residual energy from the enhanced residual signal. HMM-based connected digit recognition experiments show that the proposed feature enhancement algorithm significantly improves recognition accuracy at low SNR without causing poorer performance at high SNR.
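The abstract does not specify how the residual energy is re-estimated. As a generic illustration only, and not the authors' estimator, the sketch below inverse-filters one frame of (enhanced) speech with linear-prediction coefficients and computes the log energy of the resulting residual. The function names, the sign convention A(z) = 1 + Σ a_k z^(-k), and the energy floor are assumptions made for this example.

```python
import math

def lpc_residual(frame, a):
    """Inverse-filter a frame with A(z) = 1 + sum_k a[k-1] * z^-k.

    Assumption: samples before the start of the frame are treated as zero.
    """
    residual = []
    for n, s in enumerate(frame):
        e = s
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                e += ak * frame[n - k]
        residual.append(e)
    return residual

def log_residual_energy(residual, floor=1e-10):
    """Mean-square energy of the residual in dB, floored to avoid log(0)."""
    mean_sq = sum(e * e for e in residual) / len(residual)
    return 10.0 * math.log10(max(mean_sq, floor))

# Hypothetical frame that a first-order predictor models exactly after the
# first sample, so the residual is nonzero only at n = 0.
frame = [1.0, 0.5, 0.25, 0.125]
residual = lpc_residual(frame, [-0.5])
energy_db = log_residual_energy(residual)
```

In a real coder the coefficients would come from the decoded spectral-envelope parameters and the frame from the enhanced signal; both are stand-ins here.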
Pages: 241-244
Page count: 4