Direct Construction of Compact Context-Dependency Transducers From Data

Cited by: 0

Authors:

Rybach, David [1]

Riley, Michael [2]

Affiliations:

[1] RWTH Aachen University, Department of Computer Science, Human Language Technology and Pattern Recognition, Aachen, Germany

[2] Google Inc, New York, NY USA

Source:

11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2 | 2010

Keywords:

WFST; LVCSR; MINIMIZATION

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper describes a new method for building compact context-dependency transducers for finite-state transducer-based ASR decoders. Instead of the conventional phonetic decision-tree growing followed by FST compilation, this approach incorporates the phonetic context splitting directly into the transducer construction. The objective function of the split optimization is augmented with a regularization term that measures the number of transducer states introduced by a split. We give results on a large spoken-query task for various n-phone orders and other phonetic features that show this method can greatly reduce the size of the resulting context-dependency transducer with no significant impact on recognition accuracy. This permits using context sizes and features that might otherwise be unmanageable.
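The regularized split selection mentioned in the abstract can be illustrated with a minimal sketch. It assumes the objective takes the form of a likelihood gain minus a weight times the number of transducer states a split would add; the abstract does not give the exact formula, so this form, the weight `alpha`, and every name and number below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateSplit:
    """A hypothetical candidate phonetic-context split."""
    name: str               # illustrative label for the phonetic question
    likelihood_gain: float  # assumed gain in training-data log-likelihood
    new_states: int         # transducer states the split would introduce


def regularized_score(split: CandidateSplit, alpha: float) -> float:
    # Assumed objective: likelihood gain penalized by the number of new states.
    return split.likelihood_gain - alpha * split.new_states


def best_split(candidates, alpha: float) -> CandidateSplit:
    # Pick the candidate with the highest regularized score.
    return max(candidates, key=lambda s: regularized_score(s, alpha))


# Made-up numbers, for illustration only.
candidates = [
    CandidateSplit("left-context-is-vowel", likelihood_gain=120.0, new_states=40),
    CandidateSplit("right-context-is-nasal", likelihood_gain=100.0, new_states=5),
]

unregularized_choice = best_split(candidates, alpha=0.0)
regularized_choice = best_split(candidates, alpha=1.0)
```

With `alpha = 0.0` the sketch selects purely on likelihood gain; with a positive `alpha` a split that adds many states can lose to a slightly weaker split that adds few, which is the size-versus-accuracy trade-off the abstract describes.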

References

Pages: 218 / +

Page count: 2

11 entries in total

[1] Allauzen C., 2009, Proceedings of the Conference of the International Speech Communication Association (ISCA), P1203
[2] Chen S., 2003, P EUR C SPEECH COMM, P1169
[3] Chen, Stanley F.; Kingsbury, Brian; Mangu, Lidia; Povey, Daniel; Saon, George; Soltau, Hagen; Zweig, Geoffrey. Advances in speech transcription at IBM under the DARPA EARS program [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2006, 14 (05): 1596 - 1608
[4] Mohri, M. Minimization algorithms for sequential transducers [J]. THEORETICAL COMPUTER SCIENCE, 2000, 234 (1-2): 177 - 201
[5] Mohri M., 2008, Speech Recognition with Weighted Finite-State Transducers, P559, DOI 10.1007/978-3-540-49127-9_28
[6] Pereira F., 1997, P EUROSPEECH RHOD GR, P1427
[7] Schuster M, 2005, 2005 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU), P162
[8] Schuster M, 2005, INT CONF ACOUST SPEE, P201
[9] SPROAT R, 1996, P 34 ANN M ASS COMP, P215
[10] Young S., 1994, Proc. ARPA Spoken Language Technology Workshop, P405
