Statistical n-gram language models are traditionally developed using perplexity as a measure of goodness. However, perplexity often demonstrates a poor correlation with recognition improvements, mainly because it fails to account for the acoustic confusability between words and for search errors in the recognizer. In this paper, we study alternatives to perplexity for predicting language model performance, including other global features as well as a new approach that predicts, with a high correlation (0.96), performance differences associated with localized changes in language models given a recognition system. Experiments focus on the problem of augmenting in-domain Switchboard text with out-of-domain text from the Wall Street Journal and Broadcast News that differs in both style and content from the in-domain data.