HindiSpeech-Net: a deep learning based robust automatic speech recognition system for Hindi language

Cited by: 4

Authors:

Sharma, Usha [1]

Om, Hari [1]

Mishra, A. N. [2]

Affiliations:

[1] Indian Inst Technol ISM Dhanbad, Dept Comp Sci & Engn, Dhanbad 826004, Bihar, India

[2] Krishna Engn Coll, Ghaziabad 201001, India

Source:

MULTIMEDIA TOOLS AND APPLICATIONS | 2023 / Vol. 82 / Issue 11

Keywords:

1D-CNN; Convolutional neural network; Hindi language; Deep learning; Speech recognition; Feature extraction

DOI:

10.1007/s11042-022-14019-z

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Automatic Speech Recognition (ASR) has become a major research area over the past decade and has attracted considerable interest. System implementation, adaptation to different languages, and robust performance remain major challenges. Hindi is one of the most widely spoken languages in the world, but it is a complex and resource-constrained language. Speech recognition and classification systems therefore need to be developed for Hindi to spread the technology and to open up further means of communication. Because Hindi is more complex than other languages and lacks standard databases, developing such systems is challenging. Deep learning is used extensively across research fields and has proven effective in many of them. In this paper, a seven-layer 1D convolutional neural network, HindiSpeech-Net, is proposed to classify Hindi speech samples into their respective categories. A dataset of 2400 Hindi speech samples in ten classes was collected under real-world conditions, then filtered and augmented to enlarge the dataset, make the model more robust, and avoid overfitting. The dataset was divided into training, validation, and test sets, and the model was evaluated on several performance metrics. The trained HindiSpeech-Net model achieved an accuracy of 92.92% on the test set. The proposed framework is computationally inexpensive, works in real time, and is suitable for implementation in embedded systems.
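As a rough illustration of the building blocks that a 1D convolutional network stacks, the sketch below applies one convolution, a ReLU, and a max-pooling step to a short waveform in plain Python. It is not the HindiSpeech-Net architecture: the abstract does not give kernel sizes, filter counts, or layer order, so the function names, the toy signal, and the three-tap kernel are all assumptions chosen for illustration.

```python
from typing import List


def conv1d(signal: List[float], kernel: List[float], stride: int = 1) -> List[float]:
    """Valid-mode 1D cross-correlation, the operation a 1D-CNN layer applies."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(0, len(signal) - k + 1, stride)
    ]


def relu(xs: List[float]) -> List[float]:
    """Zero out negative activations."""
    return [max(0.0, x) for x in xs]


def max_pool1d(xs: List[float], size: int = 2) -> List[float]:
    """Keep the largest value in each non-overlapping window of `size` samples."""
    return [max(xs[i:i + size]) for i in range(0, len(xs) - size + 1, size)]


# Toy waveform and kernel (both hypothetical, not taken from the paper).
signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
kernel = [1.0, 0.0, -1.0]

features = max_pool1d(relu(conv1d(signal, kernel)))
print(features)  # [2.0, 0.0, 2.0]
```

A full classifier would presumably repeat such blocks several times and finish with a layer that produces one score per class (ten in this dataset). Operating directly on a 1D signal keeps the number of multiplications per layer small, which is consistent with the abstract's claim of low computational cost.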

Pages: 16173-16193

Page count: 21
