Vietnamese Large Vocabulary Continuous Speech Recognition

Cited by: 12
Authors
Ngoc Thang Vu [1]
Tanja Schultz [1]
Affiliation
[1] Univ Karlsruhe, Inst Anthropomat, Cognit Syst Lab, Karlsruhe, Germany
DOI
10.1109/ASRU.2009.5373424
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We report on our recent efforts toward a large vocabulary Vietnamese speech recognition system. In particular, we describe the Vietnamese text and speech database recently collected as part of our GlobalPhone corpus. The data was complemented by a large collection of text data crawled from various Vietnamese websites. To bootstrap the Vietnamese speech recognition system, we used our Rapid Language Adaptation scheme, applying a multilingual phone inventory. After initialization, we investigated the peculiarities of the Vietnamese language and achieved significant improvements by implementing different tone modeling schemes extended by pitch extraction, handling multiwords to address the monosyllabic structure of Vietnamese, and using language modeling based on 5-grams. Furthermore, we addressed the issue of dialectal variation between South and North Vietnam by creating dialect-dependent pronunciations and including dialect in the context decision tree of the recognizer. Our currently best recognition system achieves a word error rate of 11.7% on read newspaper speech.
Pages: 333-338
Page count: 6