Persian speech recognition using deep learning

Cited by: 1

Authors:

Hadi Veisi

Armita Haji Mani

Affiliation:

[1] University of Tehran, Faculty of New Sciences and Technologies (FNST)

Source:

International Journal of Speech Technology | 2020 / Vol. 23 (4)

Keywords:

Persian speech recognition; Bidirectional long short-term memory neural network; Deep neural network; Deep belief network

DOI:

Not available

Abstract:

Various methods have been used for Automatic Speech Recognition (ASR), among which the Hidden Markov Model (HMM) and Artificial Neural Networks (ANNs) are the most important. One of the remaining challenges is increasing the accuracy and efficiency of these systems, and one way to improve accuracy is to improve the acoustic model (AM). In this paper, for the first time, a deep belief network (DBN) for extracting features from speech signals is combined with a Deep Bidirectional Long Short-Term Memory (DBLSTM) network with a Connectionist Temporal Classification (CTC) output layer to create an AM on the Farsdat Persian speech data set. The results show that a deep neural network (DNN) outperforms a shallow network. In addition, a bidirectional network is more accurate than a unidirectional one, in both deep and shallow configurations. Comparison with HMM and Kaldi-DNN indicates that using DBLSTM with features extracted from the DBN increases the accuracy of Persian phoneme recognition.
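
The abstract mentions a CTC output layer but does not describe how its per-frame outputs become a phoneme sequence. As an illustration only, the following is a minimal sketch of standard best-path (greedy) CTC decoding: take the most likely label for each frame, merge consecutive repeats, then drop the blank symbol. The blank index, the label indices, and the toy scores are all assumptions made for this example and are not taken from the paper.

```python
from itertools import groupby
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank label (hypothetical choice)


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]], blank: int = BLANK
) -> List[int]:
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # Pick the highest-scoring label index for each frame.
    best_path = [
        max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores
    ]
    # Merge runs of identical consecutive labels into one label.
    collapsed = [label for label, _ in groupby(best_path)]
    # Remove blanks; a blank between two equal labels keeps them distinct.
    return [label for label in collapsed if label != blank]


# Toy input: 6 frames over 3 labels (0 = blank, 1 and 2 = hypothetical phonemes).
scores = [
    [0.10, 0.80, 0.10],  # label 1
    [0.20, 0.70, 0.10],  # label 1 (repeat, merged with the previous frame)
    [0.90, 0.05, 0.05],  # blank
    [0.10, 0.80, 0.10],  # label 1 (kept: separated from the first by a blank)
    [0.10, 0.20, 0.70],  # label 2
    [0.80, 0.10, 0.10],  # blank
]
decoded = ctc_greedy_decode(scores)
print(decoded)  # [1, 1, 2]
```

In a system like the one the abstract outlines, the per-frame scores would come from the recurrent network's output layer; here they are hard-coded so the decoding step can be read in isolation.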

Pages: 893-905

Number of pages: 12

