Analyzing and predicting language model improvements

Cited by: 8
Authors
Iyer, R [1 ]
Ostendorf, M [1 ]
Meteer, M [1 ]
Affiliation
[1] Boston Univ, Dept Elect & Comp Engn, Boston, MA 02215 USA
Source
1997 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, PROCEEDINGS | 1997
Keywords
DOI
10.1109/ASRU.1997.659013
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Statistical n-gram language models are traditionally developed using perplexity as a measure of goodness. However, perplexity often demonstrates a poor correlation with recognition improvements, mainly because it fails to account for the acoustic confusability between words and for search errors in the recognizer. In this paper, we study alternatives to perplexity for predicting language model performance, including other global features as well as a new approach that predicts, with a high correlation (0.96), performance differences associated with localized changes in language models given a recognition system. Experiments focus on the problem of augmenting in-domain Switchboard text with out-of-domain text from Wall Street Journal and Broadcast News that differs in both style and content from the in-domain data.
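For context on the metric the abstract critiques, here is a minimal sketch of how test-set perplexity is computed for a smoothed bigram model. This is illustrative only, not the system described in the paper; the model interface, the add-alpha smoothing choice, and the toy data are all assumptions.

```python
import math
from collections import Counter

def train_bigram(train_tokens, alpha=1.0):
    """Return P(w | w_prev) for an add-alpha smoothed bigram model."""
    unigram_counts = Counter(train_tokens)
    bigram_counts = Counter(zip(train_tokens, train_tokens[1:]))
    vocab_size = len(unigram_counts)

    def prob(w_prev, w):
        # Add-alpha smoothing keeps unseen bigrams at nonzero probability,
        # so perplexity stays finite on held-out text.
        return ((bigram_counts[(w_prev, w)] + alpha)
                / (unigram_counts[w_prev] + alpha * vocab_size))

    return prob

def perplexity(prob, tokens):
    """Perplexity = exp(average negative log-probability per predicted token)."""
    log_prob = sum(math.log(prob(p, w)) for p, w in zip(tokens, tokens[1:]))
    return math.exp(-log_prob / (len(tokens) - 1))

train = "the cat sat on the mat and the dog sat on the rug".split()
test = "the dog sat on the mat".split()
model = train_bigram(train)
print(f"held-out perplexity: {perplexity(model, test):.2f}")
```

The paper's point is that a drop in this number does not reliably translate into a drop in word error rate, since perplexity ignores acoustic confusability between words and search errors in the recognizer.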
Pages: 254-261
Page count: 8