Training a language model using webdata for large vocabulary Japanese spontaneous speech recognition

Cited by: 0
Authors
Masumura, Ryo [1 ]
Hahm, Seongjun [1 ]
Ito, Akinori [1 ]
Affiliations
[1] Tohoku Univ, Grad Sch Engn, Sendai, Miyagi 980, Japan
Source
12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5 | 2011
Keywords
Spontaneous speech recognition; language model; World Wide Web; large vocabulary continuous speech recognition; Corpus of Spontaneous Japanese;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper describes a language modeling method for spontaneous speech recognition that uses large-scale spoken language data retrieved from the Web. We downloaded 15 million Web pages covering a comprehensive range of topics. Next, spoken language-like texts were selected from the downloaded Web data using a naive Bayes classifier, and typical linguistic phenomena such as fillers and pauses were added using simulation models. A language model trained on the generated data performed as well as one trained on a large-scale spontaneous speech corpus (Corpus of Spontaneous Japanese, CSJ). By combining the generated data with CSJ, we improved word accuracy.
Pages: 1476-1479
Number of pages: 4
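
The two data-preparation steps named in the abstract (selecting spoken language-like text with a naive Bayes classifier, then adding fillers and pauses) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give their features, training data, or simulation models. The toy English sentences, the labels "spoken" and "written", the add-one smoothing, the per-position insertion probability, and all function names below are assumptions made for illustration only.

```python
import math
import random
from collections import Counter


def train_nb(docs):
    """Collect per-class word counts and class counts from (tokens, label) pairs."""
    counts = {}
    priors = Counter()
    for tokens, label in docs:
        counts.setdefault(label, Counter()).update(tokens)
        priors[label] += 1
    vocab = {w for c in counts.values() for w in c}
    return counts, priors, vocab


def log_score(tokens, label, counts, priors, vocab):
    """Log prior plus add-one-smoothed log likelihood of the tokens under one class."""
    total = sum(counts[label].values())
    score = math.log(priors[label] / sum(priors.values()))
    for w in tokens:
        score += math.log((counts[label][w] + 1) / (total + len(vocab)))
    return score


def is_spoken(tokens, model):
    """True if the 'spoken' class scores higher than the 'written' class."""
    return log_score(tokens, "spoken", *model) > log_score(tokens, "written", *model)


def add_fillers(tokens, fillers, prob, rng):
    """Assumed simulation model: before each token, insert a filler with probability prob."""
    out = []
    for w in tokens:
        if rng.random() < prob:
            out.append(rng.choice(fillers))
        out.append(w)
    return out


# Hypothetical toy training data; a real system would use large labeled corpora.
train = [
    ("well i think it is good you know".split(), "spoken"),
    ("so i went there you know".split(), "spoken"),
    ("the results are presented in the table".split(), "written"),
    ("the method is described in the section".split(), "written"),
]
model = train_nb(train)

# Hypothetical stand-in for sentences extracted from downloaded Web pages.
web = [
    "i think you know it".split(),
    "the table is presented in the section".split(),
]
selected = [s for s in web if is_spoken(s, model)]

# Add fillers and pause markers to the selected sentences before language model training.
rng = random.Random(0)
augmented = [add_fillers(s, ["uh", "um", "<pause>"], 0.2, rng) for s in selected]
```

In this sketch only the first toy sentence is kept as spoken-like. The augmented sentences would then be passed to an n-gram language model toolkit, optionally together with a spontaneous speech corpus such as CSJ, as the abstract describes.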
Related papers
50 in total
  • [31] A language model using variable length tokens for open-vocabulary Hangul text recognition
    Ryu, SH
    Kim, JH
    PATTERN RECOGNITION, 2004, 37 (07) : 1549 - 1552
  • [32] A Comparative Study on Neural Architectures and Training Methods for Japanese Speech Recognition
    Karita, Shigeki
    Kubo, Yotaro
    Bacchiani, Michiel Adriaan Unico
    Jones, Llion
    INTERSPEECH 2021, 2021, : 2092 - 2096
  • [33] A Language Model for Intelligent Speech Recognition of Power Dispatching
    Zhao, Qing
    Li, Tingrui
    Luo, Rui
    Li, Rui
    Han, Tianyu
    Han, Dongsheng
    PROCEEDINGS OF ACM TURING AWARD CELEBRATION CONFERENCE, ACM TURC 2021, 2021, : 131 - 135
  • [34] Automatic Clustering of Part-of-speech for Vocabulary Divided PLSA Language Model
    Suzuki, Motoyuki
    Kuriyama, Naoto
    Ito, Akinori
    Makino, Shozo
    IEEE NLP-KE 2008: PROCEEDINGS OF INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING, 2008, : 289 - +
  • [35] Minimum Word Error Rate Training with Language Model Fusion for End-to-End Speech Recognition
    Meng, Zhong
    Wu, Yu
    Kanda, Naoyuki
    Lu, Liang
    Chen, Xie
    Ye, Guoli
    Sun, Eric
    Li, Jinyu
    Gong, Yifan
    INTERSPEECH 2021, 2021, : 2596 - 2600
  • [36] Speech Recognition with Word Fragment Detection Using Prosody Features for Spontaneous Speech
    Yeh, Jui-Feng
    Yen, Ming-Chi
    APPLIED MATHEMATICS & INFORMATION SCIENCES, 2012, 6 (02): : 669S - 675S
  • [37] A generalized dynamic composition algorithm of weighted finite state transducers for large vocabulary speech recognition
    Cheng, Octavian
    Dines, John
    Doss, Mathew Magimai
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 345 - +
  • [38] Residual Language Model for End-to-end Speech Recognition
    Tsunoo, Emiru
    Kashiwagi, Yosuke
    Narisetty, Chaitanya
    Watanabe, Shinji
    INTERSPEECH 2022, 2022, : 3899 - 3903
  • [39] A New Bigram-PLSA Language Model for Speech Recognition
    Mohammad Bahrani
    Hossein Sameti
    EURASIP Journal on Advances in Signal Processing, 2010
  • [40] ON LANGUAGE MODEL INTEGRATION FOR RNN TRANSDUCER BASED SPEECH RECOGNITION
    Zhou, Wei
    Zheng, Zuoyun
    Schlueter, Ralf
    Ney, Hermann
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 8407 - 8411