Improving Large Vocabulary Urdu Speech Recognition System using Deep Neural Networks

Cited by: 6

Authors:

Farooq, Muhammad Umar [1]

Adeeba, Farah [1]

Rauf, Sahar [1]

Hussain, Sarmad [1]

Affiliations:

[1] Univ Engn & Technol, Ctr Language Engn, Al Khawarizmi Inst Comp Sci, Lahore, Pakistan

Source:

INTERSPEECH 2019 | 2019

Keywords:

Urdu; ASR; GMM-HMM; DNN-HMM; TDNN; BLSTM; RNNLM; LVCSR;

DOI:

10.21437/Interspeech.2019-2629

Chinese Library Classification (CLC) number:

R36 [Pathology]; R76 [Otorhinolaryngology];

Subject classification codes:

100104 ; 100213 ;

Abstract:

Developing a Large Vocabulary Continuous Speech Recognition (LVCSR) system is a cumbersome task, especially for low-resource languages. Urdu is the national language and lingua franca of Pakistan, with 100 million speakers worldwide. Due to resource scarcity, limited work has been done on Urdu speech recognition. This paper presents the collection of an Urdu speech corpus and the development of an Urdu speech recognition system. The Urdu LVCSR system is developed using 300 hours of read speech data with a vocabulary size of 199K words. Microphone speech is recorded from 1671 Urdu and Punjabi speakers in both indoor and outdoor environments. Different acoustic modeling techniques, such as Gaussian Mixture Model based Hidden Markov Models (GMM-HMM), Time Delay Neural Networks (TDNN), Long Short-Term Memory (LSTM) networks, and Bidirectional Long Short-Term Memory (BLSTM) networks, are investigated. Cross-entropy and Lattice-Free Maximum Mutual Information (LF-MMI) objective functions are employed during acoustic modeling. In addition, a Recurrent Neural Network Language Model (RNNLM) is used for rescoring. The developed speech recognition system has been evaluated on 9.5 hours of collected test data, and a minimum Word Error Rate (WER) of 13.50% is achieved.
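The WER figure reported in the abstract is conventionally defined as the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The snippet below is a minimal sketch of that standard definition only; it is not the authors' evaluation code, and this record does not state which scoring tool they used. The function name `word_error_rate` and the toy sentences are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: 100 * (S + D + I) / N, via word-level edit distance.

    Illustrative helper (hypothetical name); whitespace tokenization is an
    assumption and may not match the tokenization used in the paper.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # Toy example: one substitution (b -> x) and one deletion (d) over 4 words.
    print(word_error_rate("a b c d", "a x c"))
```

For a whole test set, WER is usually computed by summing the edit errors over all utterances and dividing by the total number of reference words, rather than averaging per-utterance percentages.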


Pages: 2978-2982

Number of pages: 5
