Noise Robust End-to-End Speech Recognition For Bangla Language

Citations: 0
Authors
Sumit, Sakhawat Hosain [1]
Al Muntasir, Tareq [1]
Zaman, M. M. Arefin [1]
Nandi, Rabindra Nath [1]
Sourov, Tanvir [1]
Affiliation
[1] Socian Ltd, Dhaka, Bangladesh
Source
2018 INTERNATIONAL CONFERENCE ON BANGLA SPEECH AND LANGUAGE PROCESSING (ICBSLP) | 2018
Keywords
End-to-End; Noise Robust; Speech Recognition; CTC; Bangla Language; NEURAL-NETWORKS
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A robust speech recognition system is crucial for real-world applications, where the speech signal generally contains a considerable amount of noise. We propose an end-to-end deep learning approach that leverages recent progress in automatic speech recognition to recognize continuous Bangla speech in noisy environments. We improve the robustness of our model through data augmentation and a deep model architecture. We evaluate the model on one internal dataset and two available datasets, and obtain strong results on both clean read speech and noisy speech. The model achieves a character error rate (CER) of 10.65% on CRBLP (clean read) and 34.83% on Babel (noisy), which to the best of our knowledge is the state of the art for continuous Bangla speech recognition. On our internal dataset, we achieve 12.31% and 9.15% CER on clean and noisy speech, respectively.
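The abstract reports results as CER. As a minimal sketch of how such a figure is commonly computed (not the authors' evaluation code), the example below takes the character-level Levenshtein distance between a reference and a hypothesis transcript and divides it by the reference length. The function names `edit_distance` and `cer` are hypothetical, and the sketch assumes one Unicode code point counts as one character; the record does not say whether the paper counts code points or grapheme clusters for Bangla script.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted per code point."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate in percent, relative to the reference length."""
    if not ref:
        raise ValueError("reference transcript must be non-empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)
```

With this definition, one wrong character in a four-character reference gives a CER of 25%. Because insertions are counted, the value can exceed 100% when the hypothesis is much longer than the reference.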
Pages: 5