Regularized Urdu Speech Recognition with Semi-Supervised Deep Learning

Cited by: 11
Authors
Humayun, Mohammad Ali [1]
Hameed, Ibrahim A. [2]
Shah, Syed Muslim [1]
Khan, Sohaib Hassan [1]
Zafar, Irfan [1]
Bin Ahmed, Saad [3]
Shuja, Junaid [4]
Affiliations
[1] Univ Engn & Technol Peshawar, Dept Elect Engn, Inst Commun Technol ICT Campus, Islamabad 44000, Pakistan
[2] Norwegian Univ Sci & Technol, Fac Informat Technol & Elect Engn, Dept ICT & Nat Sci, N-6001 Alesund, Norway
[3] Univ Teknol Malaysia, MJIIT, Jalan Sultan Yahya Petra, Kuala Lumpur 54100, Malaysia
[4] COMSATS Univ Islamabad, Dept Comp Sci, Abbottabad Campus, Abbottabad 22010, Pakistan
Source
APPLIED SCIENCES-BASEL | 2019 / Vol. 9 / Issue 09
Keywords
speech recognition; locally linear embedding; label propagation; Maxout; low resource languages;
DOI
10.3390/app9091956
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Automatic Speech Recognition (ASR) has achieved its best results for English with end-to-end, neural-network-based supervised models. These supervised models need large amounts of labeled speech data to generalize well, which is hard to obtain for low-resource languages like Urdu. Most models proposed for Urdu ASR are based on Hidden Markov Models (HMMs). This paper proposes an end-to-end neural network model for Urdu ASR, regularized with dropout, ensemble averaging, and Maxout units. Dropout and ensembles are averaging techniques over multiple neural network models, while Maxout units are neural network units that adapt their activation functions. Because labeled data is limited, Semi-Supervised Learning (SSL) techniques are also incorporated to improve model generalization. Speech features are transformed onto a lower-dimensional manifold using an unsupervised dimensionality-reduction technique called Locally Linear Embedding (LLE). The transformed data, along with the higher-dimensional features, is used to train neural networks. The proposed model also uses label-propagation-based self-training of the initially trained models and achieves a Word Error Rate (WER) 4% lower than the benchmark reported on the same Urdu corpus using HMMs. The decrease in WER after incorporating SSL is more significant with a larger validation data size.
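For readers unfamiliar with the metric the abstract reports, the following is a minimal sketch of how WER is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; the record does not describe the authors' scoring tool or its text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); tokenizes on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

With this definition, a hypothesis that substitutes one word and drops another out of a four-word reference scores 2 / 4 = 0.5. WER can exceed 1.0 when the hypothesis contains many insertions.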
Pages: 15