Persian speech recognition using deep learning

Cited by: 1
Authors
Hadi Veisi
Armita Haji Mani
Affiliation
[1] University of Tehran, Faculty of New Sciences and Technologies (FNST)
Source
International Journal of Speech Technology | 2020 / Vol. 23, No. 4
Keywords
Persian speech recognition; Bidirectional long short-term memory neural network; Deep neural network; Deep belief network
Abstract
To date, various methods have been used for Automatic Speech Recognition (ASR), among which the Hidden Markov Model (HMM) and Artificial Neural Networks (ANNs) are the most important. One of the remaining challenges is increasing the accuracy and efficiency of these systems. One way to enhance their accuracy is to improve the acoustic model (AM). In this paper, for the first time, a deep belief network (DBN), used to extract features from speech signals, is combined with a Deep Bidirectional Long Short-Term Memory (DBLSTM) network with a Connectionist Temporal Classification (CTC) output layer to create an AM on the Farsdat Persian speech data set. The results show that a deep neural network (DNN) outperforms a shallow network. In addition, a bidirectional network is more accurate than a unidirectional network, in both the deep and the shallow configurations. Comparison with HMM and Kaldi-DNN baselines indicates that using a DBLSTM with features extracted by the DBN increases the accuracy of Persian phoneme recognition.
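As a rough illustration of what a CTC output layer implies at decoding time, the sketch below applies best-path (greedy) decoding to per-frame label scores: take the top label in each frame, collapse consecutive repeats, then drop the blank label. This is a generic CTC technique, not necessarily the decoding procedure used in the paper. The label set, the blank symbol `_`, the function name `ctc_best_path`, and the scores are all hypothetical values chosen for illustration; they are not taken from Farsdat or from the authors' model.

```python
from itertools import groupby

BLANK = "_"  # hypothetical symbol standing in for the CTC blank label


def ctc_best_path(frame_scores, labels):
    """Greedy CTC decoding.

    frame_scores: list of per-frame score lists, one score per entry in labels.
    labels: list of label symbols, including BLANK.
    """
    # Pick the highest-scoring label in every frame.
    path = [
        labels[max(range(len(scores)), key=scores.__getitem__)]
        for scores in frame_scores
    ]
    # Collapse runs of identical consecutive labels into a single label.
    collapsed = [label for label, _ in groupby(path)]
    # Remove blanks; a blank between two equal labels keeps them distinct.
    return [label for label in collapsed if label != BLANK]


# Made-up scores over a toy label set (not the Farsdat phoneme inventory).
labels = [BLANK, "s", "a", "l"]
frame_scores = [
    [0.10, 0.70, 0.10, 0.10],  # s
    [0.10, 0.60, 0.20, 0.10],  # s (repeat, collapsed)
    [0.80, 0.10, 0.05, 0.05],  # blank
    [0.10, 0.10, 0.70, 0.10],  # a
    [0.20, 0.10, 0.10, 0.60],  # l
    [0.70, 0.10, 0.10, 0.10],  # blank separating two l frames
    [0.10, 0.10, 0.10, 0.70],  # l
]
decoded = ctc_best_path(frame_scores, labels)
print(decoded)
```

The blank label is what lets the network emit the same phoneme twice in a row: without the blank frame between the two `l` frames, they would collapse into one.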
Pages: 893-905
Page count: 12