Word Pair Approximation for More Efficient Decoding with High-Order Language Models

Cited by: 0
Authors
Nolden, David [1]
Schlueter, Ralf [1]
Ney, Hermann [1,2]
Affiliations
[1] Rhein Westfal TH Aachen, Ahornstr 55, D-52056 Aachen, Germany
[2] LIMSI CNRS, Spoken Language Proc Grp, Paris, France
Source
15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4 | 2014
Keywords
efficient; decoding; search; rescoring; word pair approximation; context approximation; VOCABULARY CONTINUOUS SPEECH;
DOI
Not available
CLC Classification Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
The search effort in LVCSR depends on the order of the language model (LM); search hypotheses are recombined only once the LM allows it. In this work we show how the LM dependence can be partially eliminated by exploiting the well-known word pair approximation. We enforce preemptive unigram- or bigram-like LM recombination at word boundaries. We capture the recombination in a lattice, and later expand the lattice using LM rescoring. LM rescoring unfolds the same search space which would have been encountered without the preemptive recombination, but the overall efficiency is improved, because the amount of redundant HMM expansion in different LM contexts is reduced. Additionally, we show how to expand the recombined hypotheses on-the-fly, omitting the intermediate lattice form. Our new approach allows using the full n-gram LM for decoding, but based on a compact unigram or bigram search space. We show that our approach works better than common lattice rescoring pipelines, where a pruned lower-order LM is used to generate lattices; such pipelines suffer from the weak lower-order LM, which guides the pruning suboptimally. Our new decoding approach improves the runtime efficiency by up to 40% at equal precision when using a large vocabulary and high-order LM.
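The core idea of the abstract, recombining hypotheses early under a weaker (bigram-like) context and later unfolding them with the full higher-order LM, can be illustrated with a toy Python sketch. All words, scores, and function names below are hypothetical; the paper's actual decoder recombines HMM-level hypotheses and works on real lattices.

```python
from collections import defaultdict

def recombine_bigram(hyps):
    """Preemptive bigram-like recombination at a word boundary: keep only
    the best-scoring hypothesis per last word (the bigram-like context),
    but remember every merged alternative so the collapsed search space
    can be expanded again during rescoring."""
    best = {}
    merged = defaultdict(list)
    for words, score in hyps:
        key = words[-1]                      # context is just the last word
        merged[key].append((words, score))
        if key not in best or score > best[key][1]:
            best[key] = (words, score)
    return list(best.values()), merged

def rescore_trigram(merged, trigram_lm, next_word, backoff=-10.0):
    """Rescoring with the full (here: trigram) LM: expand every merged
    alternative, so the search space collapsed by the preemptive
    recombination is unfolded with exact higher-order scores."""
    expanded = []
    for alts in merged.values():
        for words, score in alts:
            tri = (words[-2], words[-1], next_word)
            expanded.append((words + (next_word,),
                             score + trigram_lm.get(tri, backoff)))
    return expanded

# Toy hypotheses (word sequence, accumulated log score); values hypothetical.
hyps = [(("a", "b"), -0.5), (("c", "b"), -0.7), (("a", "c"), -1.0)]
kept, merged = recombine_bigram(hyps)   # only 2 of 3 survive: last words "b", "c"

trigram_lm = {("a", "b", "d"): -0.2, ("c", "b", "d"): -0.1, ("a", "c", "d"): -0.9}
rescored = rescore_trigram(merged, trigram_lm, "d")
best = max(rescored, key=lambda h: h[1])  # exact trigram scores pick the winner
```

The point of the sketch is that the decoder expands fewer active hypotheses (here 2 instead of 3) while the rescoring step still considers every alternative under the full LM, which mirrors the efficiency argument made in the abstract.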
Pages: 646-650
Page count: 5
Related Papers
18 records in total
  • [1] Allauzen C., 2009, Proceedings of the Conference of the International Speech Communication Association (ISCA), p. 1203
  • [2] Hori T., Hori C., Minami Y., Nakamura A., Efficient WFST-based one-pass decoding with on-the-fly hypothesis rescoring in extremely large vocabulary continuous speech recognition, IEEE Transactions on Audio, Speech, and Language Processing, 2007, 15(4), pp. 1352-1365
  • [3] Ney H., 1994, Word graph algorithm, Vol. 3, p. 1355
  • [4] Nolden D., 2012, ICASSP
  • [5] Nolden D., 2010, INTERSPEECH
  • [6] Nolden D., 2011, INTERSPEECH
  • [7] Nolden D., 2012, INTERSPEECH
  • [8] Nolden D., 2014, ICASSP
  • [9] Nolden D., 2013, ICASSP, p. 6734, DOI 10.1109/ICASSP.2013.6638965
  • [10] Nolden D., 2011, ICASSP, p. 4684