Dirichlet Class Language Models for Speech Recognition

Cited by: 47

Authors:

Chien, Jen-Tzung [1]

Chueh, Chuang-Hua [1]

Affiliation:

[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan 70101, Taiwan

Source:

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2011 / Vol. 19 / No. 03

Keywords:

Bayes procedure; clustering method; natural language; smoothing method; speech recognition;

DOI:

10.1109/TASL.2010.2050717

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

Latent Dirichlet allocation (LDA) was successfully developed for document modeling because it generalizes to unseen documents through latent topic modeling. LDA calculates the probability of a document under the bag-of-words scheme, without considering word order. Accordingly, LDA cannot be directly adopted to predict words in speech recognition systems. This work presents a new Dirichlet class language model (DCLM), which projects the sequence of history words onto a latent class space and calculates a marginal likelihood over the uncertainties of classes, which are expressed by Dirichlet priors. A Bayesian class-based language model is established, and a variational Bayesian procedure is presented for estimating the DCLM parameters. Furthermore, long-distance class information is continuously updated using the large-span history words and is dynamically incorporated into the class mixtures for a cache DCLM. Different language models are experimentally evaluated on the Wall Street Journal (WSJ) corpus, and the effects of training data size and vocabulary size are examined. We find that the cache DCLM effectively characterizes unseen n-gram events and stores class information for long-distance language modeling. This approach outperforms the other class-based and topic-based language models in terms of perplexity and recognition accuracy. The DCLM and cache DCLM achieved relative word error rate reductions of 3% to 5% over the LDA topic-based language model across different sizes of training data.
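The marginalization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes, purely for illustration, that each history position contributes a non-negative weight per class (the hypothetical table `proj`), that the summed weights act as Dirichlet parameters, and that class-conditional word probabilities (`class_word_probs`) are given. The only fact relied on is that the mean of a Dirichlet with parameters alpha is alpha divided by its sum, so integrating the class mixture over the Dirichlet prior has a closed form.

```python
def dirichlet_class_predict(history_ids, proj, class_word_probs):
    """Toy p(w | history) for every word w under a Dirichlet class mixture.

    history_ids      : word ids of the history, in order
    proj             : proj[position][word_id][cls] -> non-negative weight
                       (hypothetical parameterization, assumed for this sketch)
    class_word_probs : class_word_probs[cls][word_id] -> p(word | cls)
    """
    num_classes = len(class_word_probs)
    vocab_size = len(class_word_probs[0])

    # Project the ordered history onto the class space. Weights depend on
    # the position, so word order matters (unlike a bag-of-words model).
    alpha = [0.0] * num_classes
    for position, word_id in enumerate(history_ids):
        for cls in range(num_classes):
            alpha[cls] += proj[position][word_id][cls]

    # Marginalize over class proportions theta ~ Dirichlet(alpha):
    # E[theta_c] = alpha_c / sum(alpha).
    total = sum(alpha)
    class_weights = [a / total for a in alpha]

    return [
        sum(class_weights[c] * class_word_probs[c][w] for c in range(num_classes))
        for w in range(vocab_size)
    ]


# Toy data (made up): 2 history positions, 3 words, 2 classes.
proj = [
    [[1.0, 1.0], [2.0, 0.5], [0.5, 2.0]],  # weights for history position 0
    [[1.0, 1.0], [3.0, 0.5], [0.5, 3.0]],  # weights for history position 1
]
class_word_probs = [
    [0.7, 0.2, 0.1],  # p(word | class 0)
    [0.1, 0.2, 0.7],  # p(word | class 1)
]

probs_a = dirichlet_class_predict([1, 2], proj, class_word_probs)
probs_b = dirichlet_class_predict([2, 1], proj, class_word_probs)
print(probs_a)
print(probs_b)
```

The two calls use the same history words in a different order and yield different predictive distributions, which is the property a bag-of-words topic model lacks. Parameter estimation (the variational Bayesian procedure) and the cache extension are outside the scope of this sketch.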


Pages: 482-495

Page count: 14
