UNIFIED ASR SYSTEM USING LGM-BASED SOURCE SEPARATION, NOISE-ROBUST FEATURE EXTRACTION, AND WORD HYPOTHESIS SELECTION

Citations: 0
Authors
Fujita, Yusuke [1 ]
Takashima, Ryoichi [1 ]
Homma, Takeshi [1 ]
Ikeshita, Rintaro [1 ]
Kawaguchi, Yohei [1 ]
Sumiyoshi, Takashi [1 ]
Endo, Takashi [1 ]
Togami, Masahito [1 ]
Affiliations
[1] Hitachi Ltd, Res & Dev Grp, Bengaluru, Karnataka, India
Source
2015 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU) | 2015
Keywords
CHiME-3; local Gaussian modeling; noise-aware training; word hypothesis selection; SPEECH ENHANCEMENT; SUPPRESSION;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a unified system that combines speech source separation and automatic speech recognition (ASR) for various noise environments. The proposed system has three features. The first is local Gaussian modeling (LGM) based source separation with an efficient permutation alignment method that integrates a power spectrum correlation based method and a direction-of-arrival (DOA) based method. Evaluation results show that using the separated speech with the baseline acoustic modeling method significantly reduces the word error rate (WER). The second is multi-condition training with per-utterance normalized features and noise-aware features in the acoustic modeling step. We show that this training method is effective even when the input signal has been distorted by the source separation step. The third is a word hypothesis selection method for integrating multiple recognition results. It estimates the correct words based on a recognizer's confidence and co-occurrence characteristics, and evaluation results show that it outperforms the conventional recognizer output voting error reduction (ROVER) method. The proposed system is evaluated on the third CHiME challenge dataset, where it achieves an improvement of 66.1% over the baseline system.
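The abstract does not say how the per-utterance normalized features are computed. One common reading is mean and variance normalization applied independently to each utterance, which the sketch below illustrates under that assumption. The function name `normalize_utterance`, the `eps` guard, and the toy data are hypothetical and are not taken from the paper.

```python
import math


def normalize_utterance(frames, eps=1e-8):
    """Mean-variance normalize the feature frames of one utterance.

    frames: list of frames, each a list of floats (one value per feature
    dimension). Statistics are computed over this utterance only, so every
    dimension ends up with zero mean and roughly unit variance.
    """
    if not frames:
        return []
    n = len(frames)
    dims = len(frames[0])
    # Per-dimension mean over all frames of the utterance.
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    # Per-dimension (population) standard deviation.
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dims)
    ]
    # eps avoids division by zero for dimensions that are constant.
    return [
        [(f[d] - means[d]) / (stds[d] + eps) for d in range(dims)]
        for f in frames
    ]


# Toy utterance: 3 frames, 2 feature dimensions (assumed values).
utterance = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = normalize_utterance(utterance)
```

Because the statistics come from the utterance itself, the same code applies unchanged to enhanced or separated speech, which is the setting the abstract describes.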
Pages: 416-422
Page count: 7