Training a language model using webdata for large vocabulary Japanese spontaneous speech recognition

Citations: 0

Authors:

Masumura, Ryo [1]

Hahm, Seongjun [1]

Ito, Akinori [1]

Affiliations:

[1] Tohoku Univ, Grad Sch Engn, Sendai, Miyagi 980, Japan

Source:

12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5 | 2011

Keywords:

Spontaneous speech recognition; language model; World Wide Web; large vocabulary continuous speech recognition; Corpus of Spontaneous Japanese;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper describes a language modeling method for spontaneous speech recognition that uses large-scale spoken language data retrieved from the Web. We downloaded 15 million Web pages covering a comprehensive range of topics. Next, spoken language-like texts were selected from the downloaded Web data using a naive Bayes classifier, and typical linguistic phenomena such as fillers and pauses were added using simulation models. A language model trained on the generated data performed as well as one trained on a large-scale spontaneous speech corpus (Corpus of Spontaneous Japanese, CSJ). By combining the generated data and CSJ, we improved word accuracy.
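
The selection and augmentation steps described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the toy English sentences, the whitespace tokenization, the add-one smoothing, the function names (`train_nb`, `classify`, `insert_fillers`), and the uniform random filler insertion, which stands in for the authors' simulation models (their details are not given in this record).

```python
import math
import random
from collections import Counter


def train_nb(docs_by_label):
    """Train a multinomial naive Bayes model with add-one smoothing.

    docs_by_label maps a label (e.g. "spoken") to a list of
    whitespace-tokenized training sentences. Hypothetical helper.
    """
    vocab = {w for docs in docs_by_label.values() for d in docs for w in d.split()}
    total_docs = sum(len(docs) for docs in docs_by_label.values())
    model = {}
    for label, docs in docs_by_label.items():
        counts = Counter(w for d in docs for w in d.split())
        denom = sum(counts.values()) + len(vocab)
        model[label] = {
            "prior": math.log(len(docs) / total_docs),
            "loglik": {w: math.log((counts[w] + 1) / denom) for w in vocab},
        }
    return model


def classify(model, text):
    """Return the label with the highest log posterior score."""
    scores = {}
    for label, params in model.items():
        # Words outside the training vocabulary are ignored.
        scores[label] = params["prior"] + sum(
            params["loglik"][w] for w in text.split() if w in params["loglik"]
        )
    return max(scores, key=scores.get)


def insert_fillers(text, fillers, rate, rng):
    """Insert a random filler before each word with probability `rate`.

    This uniform insertion is an illustrative stand-in, not the
    simulation model used in the paper.
    """
    out = []
    for word in text.split():
        if rng.random() < rate:
            out.append(rng.choice(fillers))
        out.append(word)
    return " ".join(out)


# Toy training data (assumed for illustration only).
training = {
    "spoken": [
        "well i think it is kind of good you know",
        "um so we went there yeah",
    ],
    "written": [
        "the results demonstrate significant improvement",
        "this report describes the proposed method",
    ],
}
model = train_nb(training)

# Keep only Web sentences classified as spoken-style, then add fillers.
web_sentences = ["yeah i think so you know", "the proposed method"]
selected = [s for s in web_sentences if classify(model, s) == "spoken"]
augmented = [insert_fillers(s, ["uh", "um"], 0.2, random.Random(0)) for s in selected]
```

In a pipeline like the one the abstract describes, the augmented sentences would then serve as training text for a language model, possibly pooled with an existing spontaneous speech corpus.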

Pages: 1476-1479

Number of pages: 4

