Training a language model using webdata for large vocabulary Japanese spontaneous speech recognition

Citations: 0

Authors:

Masumura, Ryo [1]

Hahm, Seongjun [1]

Ito, Akinori [1]

Affiliation:

[1] Tohoku Univ, Grad Sch Engn, Sendai, Miyagi 980, Japan

Source:

12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5 | 2011

Keywords:

Spontaneous speech recognition; language model; World Wide Web; large vocabulary continuous speech recognition; Corpus of Spontaneous Japanese;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper describes a language modeling method for spontaneous speech recognition that uses large-scale spoken language data retrieved from the Web. We downloaded 15 million Web pages covering a comprehensive range of topics. Next, spoken language-like texts were selected from the downloaded Web data using a naive Bayes classifier, and typical linguistic phenomena such as fillers and pauses were added using simulation models. A language model trained on the generated data performed as well as one trained on the large-scale spontaneous speech corpus (Corpus of Spontaneous Japanese, CSJ). By combining the generated data and CSJ, we improved word accuracy.
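
The text-selection step in the abstract can be illustrated with a minimal sketch of a multinomial naive Bayes classifier that labels a sentence as spoken-like or written-like. This is not the authors' implementation: the record does not give their features, smoothing, or training data, so the add-one smoothing, the word-level tokens, the toy English sentences, and the names `train_nb`, `log_score`, and `classify` are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_nb(docs_by_label):
    """Train a multinomial naive Bayes model.

    docs_by_label maps a label to a list of tokenized sentences.
    Add-one (Laplace) smoothing is an assumption of this sketch.
    """
    vocab = {tok for docs in docs_by_label.values() for doc in docs for tok in doc}
    total_docs = sum(len(docs) for docs in docs_by_label.values())
    classes = {}
    for label, docs in docs_by_label.items():
        counts = Counter(tok for doc in docs for tok in doc)
        classes[label] = {
            "log_prior": math.log(len(docs) / total_docs),
            "counts": counts,
            # Denominator of the smoothed estimate: class tokens + vocabulary size.
            "denom": sum(counts.values()) + len(vocab),
        }
    return {"vocab": vocab, "classes": classes}


def log_score(model, label, tokens):
    """Log prior plus summed log likelihoods; out-of-vocabulary tokens are skipped."""
    entry = model["classes"][label]
    score = entry["log_prior"]
    for tok in tokens:
        if tok in model["vocab"]:
            score += math.log((entry["counts"][tok] + 1) / entry["denom"])
    return score


def classify(model, tokens):
    """Return the label with the highest log score."""
    return max(model["classes"], key=lambda label: log_score(model, label, tokens))


# Hypothetical toy training data (not from the paper).
toy_data = {
    "spoken": [
        ["well", "um", "i", "think", "so"],
        ["uh", "you", "know", "it", "is", "fine"],
    ],
    "written": [
        ["the", "results", "are", "shown", "in", "the", "table"],
        ["this", "paper", "describes", "the", "method"],
    ],
}
model = train_nb(toy_data)

# Keep only the Web sentences classified as spoken-like.
web_sentences = [
    ["um", "i", "think", "it", "is", "fine"],
    ["the", "paper", "describes", "results"],
]
selected = [s for s in web_sentences if classify(model, s) == "spoken"]
```

In this sketch, `selected` holds the sentences that would be passed on to later processing. The filler and pause simulation mentioned in the abstract is not sketched here because the record gives no details of those models.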


Pages: 1476-1479

Page count: 4
