Methods for Rapid Development of Automatic Speech Recognition System for Russian

Cited by: 0
Authors
Safarik, Radek [1]
Nouza, Jan [1]
Affiliation
[1] Tech Univ Liberec, Fac Mechatron, Inst Informat Technol & Elect, Liberec, Czech Republic
Source
2015 IEEE INTERNATIONAL WORKSHOP OF ELECTRONICS, CONTROL, MEASUREMENT, SIGNALS AND THEIR APPLICATION TO MECHATRONICS (ECMSM) | 2015
Keywords
speech recognition; multi-lingual; Russian; language model; acoustic model; Czech
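DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract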
In this paper we present our approach to the rapid and efficient development of an automatic speech recognition (ASR) system for Russian. We try to utilize our tools, procedures and data previously designed and collected for other Slavic languages, Czech and Slovak. We show how we build a large corpus of texts acquired from major publishers' web pages and convert it from Cyrillic to Latin to simplify further processing. The corpus is used to create a representative lexicon with 218K words and 259K pronunciations and a probabilistic language model. When training the acoustic model (AM), we use the GlobalPhone database of recordings and a largely automated scheme that includes bootstrapping with an existing Czech AM and several iterative steps to gradually improve both phonetic annotations and the target Russian AM. The recent prototype of the Russian ASR system is evaluated on the test part of the GlobalPhone database and achieves 18.2 % word error rate.
Pages: 6