HindiSpeech-Net: a deep learning based robust automatic speech recognition system for Hindi language

Cited by: 4
Authors
Sharma, Usha [1 ]
Om, Hari [1 ]
Mishra, A. N. [2 ]
Affiliations
[1] Indian Inst Technol ISM Dhanbad, Dept Comp Sci & Engn, Dhanbad 826004, Bihar, India
[2] Krishna Engn Coll, Ghaziabad 201001, India
Keywords
1D-CNN; Convolutional neural network; Hindi language; Deep learning; Speech recognition; Feature extraction
DOI
10.1007/s11042-022-14019-z
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Automatic Speech Recognition (ASR) has become a major research area over the past decade and has attracted considerable interest. Implementing ASR systems, adapting them to different languages, and keeping their performance robust remain major challenges. Hindi is one of the most widely spoken languages in the world, but it is a complex and resource-constrained language. Speech recognition and classification systems therefore need to be developed for Hindi to spread the technology and to open up more means of communication. However, the language's greater complexity compared with other languages and the lack of standard databases make such systems challenging to develop. Deep learning is used extensively across research fields and has proven its value widely. In this paper, a seven-layer 1D convolutional neural network, HindiSpeech-Net, is proposed to classify Hindi speech samples into their respective categories. A large dataset of 2400 Hindi speech samples in ten classes was collected under real-world conditions, then filtered and augmented to enlarge the dataset, make the model robust, and avoid overfitting. The dataset was divided into training, validation, and test sets, and the model was evaluated on several performance metrics. The trained HindiSpeech-Net model achieved an accuracy of 92.92% on the test set. The proposed framework is computationally inexpensive, works in real time, and is suitable for implementation in embedded systems.
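As an illustration of the building block the abstract names, the following is a minimal sketch of a single 1D convolution stage (convolution, ReLU, max pooling) applied to a toy waveform. It is not the authors' HindiSpeech-Net: the abstract does not give layer sizes, kernels, or strides, so the function names (`conv1d_valid`, `relu`, `max_pool1d`), the kernel, and the input values are all hypothetical and chosen only to show the mechanics.

```python
from typing import List


def conv1d_valid(signal: List[float], kernel: List[float], stride: int = 1) -> List[float]:
    """Slide `kernel` over `signal` without padding (cross-correlation, as CNN layers do)."""
    if stride < 1:
        raise ValueError("stride must be >= 1")
    k = len(kernel)
    if k == 0 or k > len(signal):
        raise ValueError("kernel must be non-empty and no longer than the signal")
    return [
        sum(signal[start + j] * kernel[j] for j in range(k))
        for start in range(0, len(signal) - k + 1, stride)
    ]


def relu(values: List[float]) -> List[float]:
    """Zero out negative activations."""
    return [max(0.0, v) for v in values]


def max_pool1d(values: List[float], size: int) -> List[float]:
    """Keep the maximum of each non-overlapping window of length `size`."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


# Hypothetical toy input: eight samples of a square-ish wave (not real speech data).
waveform = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
# Hypothetical kernel that responds to a drop between neighbouring samples.
edge_kernel = [1.0, -1.0]

features = max_pool1d(relu(conv1d_valid(waveform, edge_kernel)), size=2)
print(features)
```

A real model of this kind would stack several such stages with learned kernels and end in a ten-way classifier; those details would have to come from the full paper rather than from this record.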
Pages: 16173-16193
Number of pages: 21
Related papers
50 in total
  • [21] Skrybot - A System for Automatic Speech Recognition of Polish Language
    Pawlaczyk, Leslaw
    Bosky, Pawel
    MAN-MACHINE INTERACTIONS, 2009, 59 : 381 - +
  • [22] A deep neural network-based model for named entity recognition for Hindi language
    Richa Sharma
    Sudha Morwal
    Basant Agarwal
    Ramesh Chandra
    Mohammad S. Khan
    Neural Computing and Applications, 2020, 32 : 16191 - 16203
  • [23] Investigating a Hybrid Learning Approach for Robust Automatic Speech Recognition
    Pironkov, Gueorgui
    Wood, Sean U. N.
    Dupont, Stephane
    Dutoit, Thierry
    STATISTICAL LANGUAGE AND SPEECH PROCESSING, SLSP 2018, 2018, 11171 : 67 - 78
  • [24] LEARNING MASK SCALARS FOR IMPROVED ROBUST AUTOMATIC SPEECH RECOGNITION
    Narayanan, Arun
    Walker, James
    Panchapagesan, Sankaran
    Howard, Nathan
    Koizumi, Yuma
    2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 317 - 323
  • [25] A deep neural network-based model for named entity recognition for Hindi language
    Sharma, Richa
    Morwal, Sudha
    Agarwal, Basant
    Chandra, Ramesh
    Khan, Mohammad S.
    NEURAL COMPUTING & APPLICATIONS, 2020, 32 (20) : 16191 - 16203
  • [26] Automatic Leaf Recognition Based on Deep Semi-Supervised Learning
    Wu H.
    Xiao F.
    Shi Z.
    Wen Z.
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2023, 35 (10) : 1469 - 1478
  • [27] An efficient deep learning approach for automatic speech recognition using EEG signals
    Chinta, Babu
    Pampana, Madhuri
    Moorthi, M.
    COMPUTER METHODS IN BIOMECHANICS AND BIOMEDICAL ENGINEERING, 2025,
  • [28] Study of Deep Learning and CMU Sphinx in Automatic Speech Recognition
    Dhankar, Abhishek
    2017 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2017, : 2296 - 2301
  • [29] Automatic Speech Recognition: A survey of deep learning techniques and approaches
    Ahlawat, Harsh
    Aggarwal, Naveen
    Gupta, Deepti
    International Journal of Cognitive Computing in Engineering, 2025, 6 : 201 - 237
  • [30] Development of Hindi speech recognition system of agricultural commodities using deep neural network
    Mandal, Partho
    Jain, Shalini
    Ojha, Gaurav
    Shukla, Anupam
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1241 - 1245