Multi-pass pronunciation adaptation

Cited by: 0

Authors:

Bodenstab, Nathan [1]

Fanty, Mark [2]

Affiliations:

[1] Oregon Hlth & Sci Univ, OGI, Portland, OR 97201 USA

[2] Nuance Commun, Sunnyvale, CA USA

Source:

2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3 | 2007

Keywords:

pronunciation; speech; learning; adaptation

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Subject classification codes:

070206; 082403

Abstract:

A mapping between words and pronunciations (potential phonetic realizations) is a key component of speech recognition systems. Traditionally, this mapping has been encoded in a lexicon where each pronunciation is transcribed by a linguist or generated by a grapheme-to-phoneme algorithm. For large-vocabulary recognition systems, this process is highly susceptible to errors. We present an off-line, data-driven algorithm to correct suboptimal pronunciations using transcribed utterances. Unlike previous data-driven algorithms that struggle to balance acoustic representation and multi-speaker generalization, our multi-pass approach maximizes both criteria instead of compromising between the two. We demonstrate on an automated name dialing task that our multi-pass algorithm achieves a 70% error rate reduction when compared to a baseline grapheme-to-phoneme generated lexicon.

Citation

Pages: 865 / +

Page count: 2