Mongolian Speech Recognition Based on Deep Neural Networks

Cited by: 7

Authors:

Zhang, Hui [1]

Bao, Feilong [1]

Gao, Guanglai [1]

Affiliations:

[1] Inner Mongolia Univ, Coll Comp Sci, Hohhot 010021, Peoples R China

Source:

CHINESE COMPUTATIONAL LINGUISTICS AND NATURAL LANGUAGE PROCESSING BASED ON NATURALLY ANNOTATED BIG DATA (CCL 2015) | 2015 / Vol. 9427

Keywords:

Mongolian; Deep Neural Networks (DNNs); Gaussian Mixture Models (GMMs); N-gram language model;

DOI:

10.1007/978-3-319-25816-4_15

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Mongolian is an influential language, and better Mongolian Large Vocabulary Continuous Speech Recognition (LVCSR) systems are needed. Speech recognition research has recently improved substantially with the introduction of Deep Neural Networks (DNNs). In this study, a DNN-based Mongolian LVCSR system is built. Experimental results show that the DNN-based models outperform conventional models based on Gaussian Mixture Models (GMMs) by a large margin on Mongolian speech recognition. Compared with the best GMM-based model, the DNN-based model obtains a relative improvement of over 50%, making it a new state-of-the-art system in this field.


Pages: 180-188

Page count: 9
