Modeling auditory perception to improve robust speech recognition

Cited by: 0
Authors
Strope, B [1]
Alwan, A [1]
Affiliations
[1] Univ Calif Los Angeles, Dept Elect Engn, Los Angeles, CA 90095 USA
Source
THIRTY-FIRST ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, VOLS 1 AND 2 | 1998
DOI
Not available
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
While non-stationary stochastic techniques have led to substantial improvements in vocabulary size and speaker independence, most automatic speech recognition (ASR) systems remain overly sensitive to the acoustic environment, precluding robust widespread applications. Our approach to this problem has been to model fundamental aspects of auditory perception, which are typically neglected in common ASR front ends, to derive a more robust and phonetically relevant parameterization of speech. Modeling short-term adaptation and recovery and a sensitivity to local spectral peaks, together with an explicit parameterization of the position and motion of those peaks, reduces the error rate of a word recognition task by as much as a factor of 4. Current work also investigates the perceptual significance of pitch-rate amplitude-modulation cues in noise.
Pages: 1056-1060
Page count: 5