Syllable Based Language Model for Large Vocabulary Continuous Speech Recognition of Polish

Cited by: 0
Authors
Majewski, Piotr [1]
Affiliation
[1] Univ Lodz, Fac Math & Comp Sci, PL-90238 Lodz, Poland
Source
TEXT, SPEECH AND DIALOGUE, PROCEEDINGS | 2008 / Vol. 5246
Keywords
Polish; large vocabulary continuous speech recognition; language modeling; sub-word units; syllable-based units;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Most state-of-the-art large vocabulary continuous speech recognition systems use word-based n-gram language models. Such models are not an optimal solution for inflectional or agglutinative languages. Polish is a highly inflectional language and requires very large corpora to build an adequate language model with a small out-of-vocabulary ratio. We propose a syllable-based language model, which is better suited to a highly inflectional language like Polish. When resources are scarce (i.e. small corpora), the syllable-based model outperforms word-based models in terms of the number of out-of-vocabulary units (syllables in our model). Such a model is an approximation of the morpheme-based model for Polish. In our paper, we show the results of an evaluation of the syllable-based model and its usefulness in speech recognition tasks.
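The out-of-vocabulary comparison described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy word lists, the helper names `naive_syllables` and `oov_rate`, and above all the syllabifier, which is a deliberately crude vowel-based splitter rather than a real Polish syllabification procedure.

```python
import re

# Assumption: a crude splitter in which every chunk ends at a vowel and any
# trailing consonants are attached to the last chunk. Real Polish
# syllabification (digraphs, palatalizing "i", etc.) is more involved.
VOWELS = "aąeęioóuy"

def naive_syllables(word):
    """Split a lowercase word into rough syllable-like chunks (hypothetical helper)."""
    chunks = re.findall(rf"[^{VOWELS}]*[{VOWELS}]", word)
    if not chunks:
        return [word]  # no vowel found: keep the word as a single unit
    tail = word[sum(len(c) for c in chunks):]
    chunks[-1] += tail  # attach trailing consonants to the last chunk
    return chunks

def oov_rate(train_units, test_units):
    """Fraction of test units that never occur in the training units."""
    vocab = set(train_units)
    test_units = list(test_units)
    missing = sum(1 for unit in test_units if unit not in vocab)
    return missing / len(test_units)

# Toy data (assumption): inflected forms seen in training vs. unseen forms in test.
train_words = ["kota", "domy", "ryba"]
test_words = ["koty", "ryby"]

word_oov = oov_rate(train_words, test_words)
syllable_oov = oov_rate(
    [s for w in train_words for s in naive_syllables(w)],
    [s for w in test_words for s in naive_syllables(w)],
)

print(f"word-level OOV rate:     {word_oov:.2f}")
print(f"syllable-level OOV rate: {syllable_oov:.2f}")
```

On this toy data both test words are unseen as whole words, while half of their syllable chunks already occur in training. This is the kind of effect the abstract attributes to sub-word units when the corpus is small; the numbers themselves say nothing about the paper's actual results.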
Pages: 397-401
Page count: 5