Direct Construction of Compact Context-Dependency Transducers From Data

Cited by: 0
Authors
Rybach, David [1 ]
Riley, Michael [2 ]
Affiliations
[1] Rhein Westfal TH Aachen, Dept Comp Sci, Human Language Technol & Pattern Recognit, Aachen, Germany
[2] Google Inc, New York, NY USA
Source
11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2 | 2010
Keywords
WFST; LVCSR; MINIMIZATION;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper describes a new method for building compact context-dependency transducers for finite-state transducer-based ASR decoders. Instead of the conventional phonetic decision-tree growing followed by FST compilation, this approach incorporates the phonetic context splitting directly into the transducer construction. The objective function of the split optimization is augmented with a regularization term that measures the number of transducer states introduced by a split. We give results on a large spoken-query task for various n-phone orders and other phonetic features that show this method can greatly reduce the size of the resulting context-dependency transducer with no significant impact on recognition accuracy. This permits using context sizes and features that might otherwise be unmanageable.
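The regularized split criterion mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual objective, so everything here is an assumption made for illustration: the linear penalty form (likelihood gain minus a weight times the number of new transducer states), the weight `alpha`, the names `CandidateSplit`, `regularized_score`, and `best_split`, and the numeric values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateSplit:
    """Hypothetical candidate phonetic-context split (illustrative only)."""
    name: str
    likelihood_gain: float  # assumed gain in training-data log-likelihood
    new_states: int         # assumed number of transducer states the split adds


def regularized_score(split: CandidateSplit, alpha: float) -> float:
    # Assumed form: likelihood gain penalized linearly by added states.
    return split.likelihood_gain - alpha * split.new_states


def best_split(candidates: list[CandidateSplit], alpha: float) -> CandidateSplit:
    # Pick the candidate with the highest regularized score.
    return max(candidates, key=lambda s: regularized_score(s, alpha))


# Made-up numbers: the first split gains more likelihood but adds many states.
candidates = [
    CandidateSplit("left-context is vowel", likelihood_gain=120.0, new_states=40),
    CandidateSplit("right-context is nasal", likelihood_gain=100.0, new_states=5),
]

unregularized = best_split(candidates, alpha=0.0)
regularized = best_split(candidates, alpha=1.0)
```

With `alpha=0.0` the choice depends on likelihood gain alone; with a positive `alpha` a split that adds few states can win despite a smaller gain, which is the trade-off between transducer size and model fit that the abstract describes.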
Pages: 218 / +
Page count: 2