Syllable language models for Mandarin speech recognition: Exploiting character language models

Cited by: 18
Authors
Liu, Xunying [1 ]
Hieronymus, James L. [2 ]
Gales, Mark J. F. [1 ]
Woodland, Philip C. [1 ]
Affiliations
[1] Univ Cambridge, Dept Engn, Cambridge CB2 1PZ, England
[2] Int Comp Sci Inst, Berkeley, CA 94704 USA
Source
Keywords
CHINESE-LANGUAGE; ADAPTATION; ALGORITHM;
DOI
10.1121/1.4768800
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Mandarin Chinese is based on characters which are syllabic in nature and morphological in meaning. All spoken languages have syllabiotactic rules which govern the construction of syllables and their allowed sequences. These constraints are not as restrictive as those learned from word sequences, but they can provide additional useful linguistic information. Hence, it is possible to improve speech recognition performance by appropriately combining these two types of constraints. For the Chinese language considered in this paper, character level language models (LMs) can be used as a first level approximation to allowed syllable sequences. To test this idea, word and character level n-gram LMs were trained on 2.8 billion words (equivalent to 4.3 billion characters) of texts from a wide collection of text sources. Both hypothesis and model based combination techniques were investigated to combine word and character level LMs. Significant character error rate reductions of up to 7.3% relative were obtained on a state-of-the-art Mandarin Chinese broadcast audio recognition task using an adapted history dependent multi-level LM that performs a log-linear combination of character and word level LMs. This supports the hypothesis that character or syllable sequence models are useful for improving Mandarin speech recognition performance. © 2013 Acoustical Society of America. [http://dx.doi.org/10.1121/1.4768800]
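The log-linear combination mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's adapted, history dependent multi-level LM: it only rescores whole hypotheses with fixed weights, and the bigram tables, the probability floor, the weights, and all function names below are hypothetical values chosen for illustration.

```python
import math


def char_sequence(words):
    """Split a word sequence into characters (one character ~ one syllable)."""
    return [ch for word in words for ch in word]


def bigram_logprob(tokens, bigram_probs, floor=1e-6):
    """Log-probability of a token sequence under a toy bigram table.

    `bigram_probs` maps (previous, current) -> probability. Unseen pairs get
    `floor`, a crude stand-in for real smoothing or back-off (an assumption).
    """
    logp = 0.0
    prev = "<s>"
    for tok in tokens:
        logp += math.log(bigram_probs.get((prev, tok), floor))
        prev = tok
    return logp


def log_linear_score(words, word_lm, char_lm, word_weight=0.7, char_weight=0.3):
    """Weighted sum of word-level and character-level log-probabilities."""
    word_part = bigram_logprob(words, word_lm)
    char_part = bigram_logprob(char_sequence(words), char_lm)
    return word_weight * word_part + char_weight * char_part


def rescore(nbest, word_lm, char_lm, **weights):
    """Return the hypothesis with the highest combined score."""
    return max(nbest, key=lambda w: log_linear_score(w, word_lm, char_lm, **weights))


# Toy data (invented): the word LM cannot separate the two hypotheses,
# while the character LM prefers the first one.
word_lm = {
    ("<s>", "北京"): 0.2, ("北京", "大学"): 0.5,
    ("<s>", "背景"): 0.2, ("背景", "大学"): 0.5,
}
char_lm = {
    ("<s>", "北"): 0.1, ("北", "京"): 0.8, ("京", "大"): 0.3, ("大", "学"): 0.9,
    ("<s>", "背"): 0.1, ("背", "景"): 0.5, ("景", "大"): 0.01,
}
nbest = [["背景", "大学"], ["北京", "大学"]]
best = rescore(nbest, word_lm, char_lm)
```

In this toy setup the word-level scores tie, so the character-level term alone decides the ranking. Setting `char_weight=0.0` and `word_weight=1.0` recovers plain word-level scoring.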
Pages: 519-528
Page count: 10
Related papers
50 in total
  • [1] Syllable language models for Mandarin speech recognition: Exploiting character language models
    Liu, X. (xl207@eng.cam.ac.uk), 1600, Acoustical Society of America (133):
  • [2] Web-data augmented language models for Mandarin conversational speech recognition
    Ng, T
    Ostendorf, M
    Hwang, MY
    Siu, MH
    Bulyko, I
    Lei, X
    2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 589 - 592
  • [3] Analysis of syllable duration models for mandarin speech
    Lai, WH
    Chen, SH
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 497 - 500
  • [4] Exploiting Future Word Contexts in Neural Network Language Models for Speech Recognition
    Chen, Xie
    Liu, Xunying
    Wang, Yu
    Ragni, Anton
    Wong, Jeremy H. M.
    Gales, Mark J. F.
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (09) : 1444 - 1454
  • [5] Exploiting Chinese Character Models to Improve Speech Recognition Performance
    Hieronymus, J. L.
    Liu, X.
    Gales, M. J. F.
    Woodland, P. C.
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 364 - +
  • [6] Gaussian mixture language models for speech recognition
    Afify, Mohamed
    Siohan, Olivier
    Sarikaya, Ruhi
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 29 - +
  • [7] Improving language models for radiology speech recognition
    Paulett, John M.
    Langlotz, Curtis P.
    JOURNAL OF BIOMEDICAL INFORMATICS, 2009, 42 (01) : 53 - 58
  • [8] Language Models for Tamil Speech Recognition System
    Saraswathi, S.
    Geetha, T. V.
    IETE TECHNICAL REVIEW, 2007, 24 (05) : 375 - 383
  • [9] Discriminative training of language models for speech recognition
    Kuo, KHJ
    Fosler-Lussier, E
    Jiang, H
    Lee, CH
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 325 - 328
  • [10] GEOGRAPHIC LANGUAGE MODELS FOR AUTOMATIC SPEECH RECOGNITION
    Xiao, Xiaoqiang
    Chen, Hong
    Zylak, Mark
    Sosa, Daniela
    Desu, Suma
    Krishnamoorthy, Mahesh
    Liu, Daben
    Paulik, Matthias
    Zhang, Yuchen
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 6124 - 6128